

Title: LDA v. LSA: A Comparison of Two Computational Text Analysis Tools for the Functional Categorization of Patents
One means of supporting design-by-analogy (DbA) in practice is giving designers efficient access to source analogies as inspiration for solving problems. The patent database has been used for many DbA support efforts, as it is a preexisting repository of catalogued technology. Latent Semantic Analysis (LSA) has been shown to be an effective computational text processing method for extracting meaningful similarities between patents for useful functional exploration during DbA. However, this has only been shown to be useful at a small scale (100 patents). Considering the vastness of the patent database and realistic exploration at a large scale, it is important to consider how these computational analyses change with orders of magnitude more data. We present an analysis of 1,000 randomly selected mechanical patents, comparing the ability of LSA and Latent Dirichlet Allocation (LDA) to categorize patents into meaningful groups. Resulting implications for large(r) scale data mining of patents for DbA support are detailed.
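
The comparison described above can be prototyped with off-the-shelf tools. The sketch below is illustrative only and is not the authors' pipeline: it treats LSA as TF-IDF followed by truncated SVD and LDA as a count-based topic model, then clusters either representation into candidate functional groups. The corpus, parameter values, and function names are hypothetical placeholders.

    from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
    from sklearn.decomposition import TruncatedSVD, LatentDirichletAllocation
    from sklearn.cluster import KMeans

    def lsa_embedding(patent_texts, n_dims=50):
        # LSA: TF-IDF term-document matrix reduced with truncated SVD.
        tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
        return TruncatedSVD(n_components=n_dims, random_state=0).fit_transform(
            tfidf.fit_transform(patent_texts))

    def lda_embedding(patent_texts, n_topics=50):
        # LDA: raw term counts modeled as per-document topic mixtures.
        counts = CountVectorizer(stop_words="english", max_features=5000)
        return LatentDirichletAllocation(n_components=n_topics, random_state=0).fit_transform(
            counts.fit_transform(patent_texts))

    def functional_groups(embedding, n_groups=10):
        # Cluster either representation into candidate functional categories.
        return KMeans(n_clusters=n_groups, n_init=10, random_state=0).fit_predict(embedding)

    # Usage (hypothetical): given a list of 1,000 patent abstracts `patents`, compare
    # functional_groups(lsa_embedding(patents)) with functional_groups(lda_embedding(patents))
    # against expert-assigned functional categories.
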
Award ID(s):
1663204
NSF-PAR ID:
10055536
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
International Conference on Case-Based Reasoning
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Design-by-analogy (DbA) is an important method for innovation that has gained much attention due to its history of leading to successful and novel design solutions. The method uses a repository of existing design solutions in which designers can recognize and retrieve analogical inspirations. Yet exploring for analogical inspiration has been a laborious task for designers. This work presents a computational methodology driven by a topic modeling technique called non-negative matrix factorization (NMF). NMF is widely used in the text mining field for its ability to discover topics within documents based on their semantic content. In the proposed methodology, NMF is performed iteratively to build hierarchical repositories of design solutions, with which designers can explore clusters of analogical stimuli. This methodology has been applied to a repository of mechanical design-related patents, processed to contain only component-, behavior-, or material-based content, to test whether unique and valuable attribute-based analogical inspiration can be discovered from the different representations of patent data. The hierarchical repositories have been visualized, and a case study has been conducted to test the effectiveness of the analogical retrieval process of the proposed methodology. Overall, this paper demonstrates that the exploration-based computational methodology may provide designers enhanced control over design repositories to retrieve analogical inspiration for DbA practice.
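    A minimal sketch of the basic NMF step this abstract describes, assuming scikit-learn; it is not the paper's implementation, and the corpus, topic count, and helper name are placeholders. Each row of the factor H names a topic by its strongest terms, and re-running the factorization within one topic yields the hierarchical structure the authors mention.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.decomposition import NMF

        def nmf_topics(patent_texts, n_topics=20, top_terms=10):
            # Factor the TF-IDF term-document matrix X into W (documents x topics)
            # and H (topics x terms); the largest entries of each H row name the topic.
            vec = TfidfVectorizer(stop_words="english", max_features=5000)
            X = vec.fit_transform(patent_texts)
            model = NMF(n_components=n_topics, init="nndsvd", random_state=0)
            doc_topic = model.fit_transform(X)
            terms = np.array(vec.get_feature_names_out())
            topic_terms = [terms[np.argsort(row)[::-1][:top_terms]].tolist()
                           for row in model.components_]
            return doc_topic, topic_terms

        # Hierarchical use, as the abstract describes: assign each patent to its
        # strongest topic (doc_topic.argmax(axis=1)), then re-run nmf_topics() on the
        # texts within one topic to split that cluster into finer-grained sub-topics.
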
  2. This paper presents an exploration-based computational methodology to aid the analogical retrieval process in design-by-analogy practice. The computational methodology, driven by Nonnegative Matrix Factorization (NMF), iteratively builds hierarchical repositories of design solutions within which clusters of design analogies can be explored by designers. In this work, the methodology has been applied to a large repository of mechanical design-related patents, processed to contain only component-, behavior-, or material-based content, to demonstrate that unique and valuable attribute-based analogical inspiration can be discovered from different representations of patent data. For explorative purposes, the hierarchical repositories have been visualized with a three-dimensional hierarchical structure and a two-dimensional bar graph structure, which can be used interchangeably for retrieving analogies. This paper demonstrates that the exploration-based computational methodology provides designers enhanced control over design repositories, empowering them to retrieve analogical inspiration for design-by-analogy practice.
  3. Abstract: Research summary

    To what extent do firms rely on basic science in their R&D efforts? Several scholars have sought to answer this and related questions, but progress has been impeded by the difficulty of matching unstructured references in patents to published papers. We introduce an open‐access dataset of references from the front pages of patents granted worldwide to scientific papers published since 1800. Each patent‐paper linkage is assigned a confidence score, which is characterized in a random sample by false negatives versus false positives. All matches are available for download at http://relianceonscience.org. We outline several avenues for strategy research enabled by these new data.

    Managerial summary

    To what extent do firms rely on basic science in their R&D efforts? Several scholars have sought to answer this and related questions, but progress has been impeded by the difficulty of matching unstructured references in patents to published papers. We introduce an open‐access dataset of references from the front pages of patents granted worldwide to scientific papers published since 1800. Each patent‐paper linkage is assigned a confidence score, and we check a random sample of these confidence scores by hand in order to estimate both coverage (i.e., of the matches we should have found, what percentage did we find) and accuracy (i.e., of the matches we found, what percentage are correct). We outline several avenues for strategy research enabled by these new data.

     
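    A toy illustration of the coverage and accuracy estimates described above, assuming a hand-checked random sample of candidate patent-paper matches; the counts below are invented for the example.

        def coverage_and_accuracy(found_correct, found_total, true_links_in_sample):
            accuracy = found_correct / found_total            # of matches found, % correct
            coverage = found_correct / true_links_in_sample   # of true links, % found
            return coverage, accuracy

        # e.g., if hand-checking shows 90 of 100 reported matches are correct,
        # and the sample actually contains 120 true patent-paper links:
        cov, acc = coverage_and_accuracy(90, 100, 120)
        print(f"coverage={cov:.2f}, accuracy={acc:.2f}")   # coverage=0.75, accuracy=0.90
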
  4. Learning global features by aggregating information over multiple views has been shown to be effective for 3D shape analysis. For view aggregation in deep learning models, pooling has been applied extensively. However, pooling leads to a loss of the content within views and of the spatial relationships among views, which limits the discriminability of learned features. We propose 3DViewGraph to resolve this issue; it learns 3D global features by more effectively aggregating unordered views with attention. Specifically, unordered views taken around a shape are regarded as view nodes on a view graph. 3DViewGraph first learns a novel latent semantic mapping to project low-level view features into meaningful latent semantic embeddings in a lower-dimensional space, which is spanned by latent semantic patterns. Then, the content and spatial information of each pair of view nodes are encoded by a novel spatial pattern correlation, where the correlation is computed among latent semantic patterns. Finally, all spatial pattern correlations are integrated with attention weights learned by a novel attention mechanism. This further increases the discriminability of learned features by highlighting the unordered view nodes with distinctive characteristics and suppressing the ones with appearance ambiguity. We show that 3DViewGraph outperforms state-of-the-art methods on three large-scale benchmarks.

     
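    The attention-weighted aggregation idea in this abstract can be illustrated with a few lines of linear algebra. The sketch below is a loose analogy, not the 3DViewGraph architecture: it projects placeholder view features into a latent space, scores views by pairwise correlation, and aggregates them with softmax attention weights. All dimensions and inputs are random stand-ins.

        import numpy as np

        rng = np.random.default_rng(0)

        # Placeholder inputs: feature vectors for V unordered views of one shape and a
        # projection into a K-dimensional latent space (both random stand-ins here).
        V, D, K = 12, 256, 32
        view_feats = rng.normal(size=(V, D))
        projection = rng.normal(size=(D, K)) * 0.01

        def softmax(x):
            x = x - x.max()
            e = np.exp(x)
            return e / e.sum()

        embeddings = view_feats @ projection       # each view mapped to the latent space
        correlation = embeddings @ embeddings.T    # pairwise similarity between view embeddings
        attention = softmax(correlation.mean(axis=1))                   # one weight per view
        global_feature = (attention[:, None] * embeddings).sum(axis=0)  # weighted aggregation
        print(global_feature.shape)                # (K,) global descriptor for the shape
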
  5. Obeid, Iyad; Selesnick, Ivan; Picone, Joseph (Eds.)
    The goal of this work was to design a low-cost computing facility that can support the development of an open source digital pathology corpus containing 1M images [1]. A single image from a clinical-grade digital pathology scanner can range in size from hundreds of megabytes to five gigabytes, and a 1M-image database requires over a petabyte (PB) of disk space. Doing meaningful work in this problem space requires a significant allocation of computing resources. The improvements and expansions to our HPC (high-performance computing) cluster, known as Neuronix [2], required to support working with digital pathology fall into two broad categories: computation and storage.

    To handle the increased computational burden and increase job throughput, we are using Slurm [3] as our scheduler and resource manager. For storage, we have designed and implemented a multi-layer filesystem architecture that distributes a filesystem across multiple machines. These enhancements, which are entirely based on open source software, have extended the capabilities of our cluster and increased its cost-effectiveness.

    Slurm has numerous features that allow it to generalize to a number of different scenarios. Among the most notable is its support for GPU (graphics processing unit) scheduling. GPUs can offer a tremendous performance increase in machine learning applications [4], and Slurm's built-in mechanisms for handling them were a key factor in making this choice. Slurm has a general resource (GRES) mechanism that can be used to configure and enable support for resources beyond the ones provided by the traditional HPC scheduler (e.g., memory, wall-clock time), and GPUs are among the GRES types that Slurm supports [5]. In addition to tracking resources, Slurm strictly enforces resource allocation. This becomes very important as the computational demands of jobs increase, so that each job has all the resources it needs and does not take resources from other jobs. It is a common practice among GPU-enabled frameworks to query the CUDA runtime library/drivers and iterate over the list of GPUs, attempting to establish a context on all of them. Slurm is able to affect the hardware discovery process of these jobs, which enables a number of them to run alongside each other even if the GPUs are in exclusive-process mode.

    To store large quantities of digital pathology slides, we developed a robust, extensible distributed storage solution. We utilized a number of open source tools to create a single filesystem, which can be mounted by any machine on the network. At the lowest layer of abstraction are the hard drives, which were split into four 60-disk chassis using 8 TB drives. To support these disks, we have two server units, each equipped with Intel Xeon CPUs and 128 GB of RAM. At the filesystem level, we have implemented a multi-layer solution that: (1) connects the disks together into a single filesystem/mountpoint using ZFS (the Zettabyte File System) [6], and (2) connects filesystems on multiple machines together to form a single mountpoint using Gluster [7]. ZFS, initially developed by Sun Microsystems, provides disk-level awareness and a filesystem that takes advantage of that awareness to provide fault tolerance. At the filesystem level, ZFS protects against data corruption and the infamous RAID write-hole bug by implementing a journaling scheme (the ZFS intent log, or ZIL) and copy-on-write functionality.

    Each machine (1 controller + 2 disk chassis) has its own separate ZFS filesystem. Gluster, essentially a meta-filesystem, takes each of these and provides the means to connect them together over the network using distributed (similar to RAID 0, but without striping individual files) and mirrored (similar to RAID 1) configurations [8]. By implementing these improvements, it has been possible to expand the storage and computational power of the Neuronix cluster arbitrarily, supporting the most computationally intensive endeavors by scaling horizontally. We have greatly improved the scalability of the cluster while maintaining its excellent price/performance ratio [1].
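    As a minimal sketch of the GPU scheduling discussed above (not the Neuronix configuration), a job can request GPUs through Slurm's GRES mechanism; the partition name, resource amounts, and script path below are placeholders, and the snippet assumes sbatch is available on a Slurm head node.

        import subprocess

        def submit_gpu_job(script_path, gpus=2, mem_gb=32, hours=4, partition="gpu"):
            cmd = [
                "sbatch",
                f"--partition={partition}",
                f"--gres=gpu:{gpus}",        # GRES request: Slurm tracks and enforces GPU use
                f"--mem={mem_gb}G",
                f"--time={hours:02d}:00:00",
                script_path,
            ]
            # Returns Slurm's "Submitted batch job <id>" line on success.
            return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

        # Example (placeholder script path): submit_gpu_job("train_pathology_model.sh", gpus=4)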