Apr 29, 2008

Initial discussions

The first two discussions with Shalini Urs:
1st Discussion: Identify the problem. I took the following topics with me: patent mapping using text mining techniques, collaborative (collective) intelligence, open science, and open innovation.

We settled on the topic "Text mining". The reasons: the topic seems doable, and we know experts in the area whom we can contact for guidance. The important thing is to develop a new method/algorithm/technology and use a corpus to demonstrate it. We may use patents or any other corpus for this.

Additional target: one paper per quarter

2nd Discussion:
Implementation:
- Download patents and time them to see pace (to estimate how much time is required to download the files)
- Build a test database out of these downloaded patents
- Install the text mining software from the book by Manu K. and see how it works for the test database.
- Results could be published as a paper
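
A minimal sketch of the timing step above, assuming Python. The `fetch` argument is a hypothetical downloader that the caller supplies; nothing here is tied to a particular patent office site, and the extrapolation is a plain sample mean multiplied by the corpus size.

```python
import time
from typing import Callable, Iterable


def time_downloads(patent_ids: Iterable[str], fetch: Callable[[str], bytes]) -> list[float]:
    """Fetch each patent once and return the seconds each fetch took."""
    durations = []
    for patent_id in patent_ids:
        start = time.perf_counter()
        fetch(patent_id)  # hypothetical downloader supplied by the caller
        durations.append(time.perf_counter() - start)
    return durations


def estimate_total_seconds(sample_durations: list[float], corpus_size: int) -> float:
    """Extrapolate from the mean of a timed sample to the full corpus."""
    if not sample_durations:
        raise ValueError("need at least one timed download")
    mean = sum(sample_durations) / len(sample_durations)
    return mean * corpus_size
```

Timing a small sample first (say a few dozen patents) should be enough to judge whether downloading the full set is a matter of hours or weeks.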

Other than the practical implementation, I will continue to:
- Advance my knowledge on text mining techniques by reading books and papers
- Summarize the different techniques used for text mining

Apr 25, 2008

About Patent Mapping

Discussions with Anji:
Patent mapping is an exercise that has been done for a long time, and many tools are available for it. The field may be too saturated, so a thorough literature review is essential.

A patent map over time may not tell much about a company's strategy. A company's decisions to patent an idea or to abandon patents are very complex and may not be revealed by patent mapping.

Filing a patent and keeping it "alive" is also expensive for a company. Only the best ideas, or those likely to have financial implications, are patented. Defensive publication of ideas that are only "good" is an alternative that reduces this cost. Technically, even articles a company publishes in the public domain are part of the company's IP.

After the research proposal formulation, it would be good to meet experts in the field. Potential people:
IPR Cell, IISc
A patent lawyer
Other researchers in this area

Apr 24, 2008

References

1. Tseng, Yuen-Hsien; Lin, Chi Jen; Lin, Yu-I. "Text Mining Techniques for Patent Analysis." Information Processing and Management, Vol. 43 (2007), 1216-1247.
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6VC8-4MX54T9-4&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=58e329e1a40fb0f890eea4f0140363b7

Text Mining Book

Text Mining Application Programming by Manu Konchady
About the book: A practical approach to text mining with code on a CD. Useful for getting a quick, working insight if you are new to the field.

More details:
Paperback: 432 pages
Publisher: Charles River Media
ISBN-10: 1584504609
ISBN-13: 978-1584504603
At Amazon: http://www.amazon.com/Text-Mining-Application-Programming/dp/1584504609

Possible Topics

Here are some of the possible topics we've been exploring:
1. Patent Mapping using text mining techniques
- Show a company's IP holdings: in which technologies does its strength lie?
- How has the company's IP map changed over time?
- How did an external event (like the takeover of a company) change its IP portfolio? Has it changed the direction of the company's portfolio after some time (do they continue to build upon it)?
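
To make the "IP map over time" idea concrete, here is a rough Python sketch. It assumes each patent has already been assigned a technology class (for example by a text mining step); the function names and the `(year, technology_class)` record layout are my own placeholders, not part of any existing tool.

```python
from collections import Counter, defaultdict


def patent_map_by_year(patents):
    """Count patents per technology class for each year.

    `patents` is an iterable of (year, technology_class) pairs.
    """
    counts = defaultdict(Counter)
    for year, tech_class in patents:
        counts[year][tech_class] += 1
    return {year: dict(c) for year, c in sorted(counts.items())}


def share_shift(counts_by_year, year_a, year_b):
    """Change in each class's share of the portfolio from year_a to year_b."""
    def shares(year):
        total = sum(counts_by_year[year].values())
        return {k: v / total for k, v in counts_by_year[year].items()}

    a, b = shares(year_a), shares(year_b)
    return {k: b.get(k, 0.0) - a.get(k, 0.0) for k in sorted(set(a) | set(b))}
```

Comparing the shares for a year before and a year after a takeover would give a first, crude view of whether the portfolio changed direction.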

2. Automatic Classification of documents to a standard ontology (using text mining)
- Classify Vidyanidhi's thesis collection into standard ontologies
- Classify IBID news articles into the SIC
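
As a starting point for the classification topic, a very simple baseline could look like the Python sketch below: each category is described by a short text, and a document goes to the category whose description it most resembles (bag-of-words cosine similarity). This is only an assumed baseline for illustration, not a method we have chosen, and the category descriptions passed in would be hypothetical stand-ins for a real ontology.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(document, category_descriptions):
    """Return the name of the category whose description is most similar."""
    doc_vec = Counter(tokenize(document))
    scores = {
        name: cosine(doc_vec, Counter(tokenize(desc)))
        for name, desc in category_descriptions.items()
    }
    return max(scores, key=scores.get)
```

A real attempt would need stemming, stop-word removal, term weighting, and labelled training documents, but a baseline like this gives something to compare against.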

Feb 20, 2008

Topics to Study

State of the art in the following topics:

  • Text mining techniques
  • Statistical methods
  • Information visualization techniques
  • Programming languages
  • WordNet