#document-similarity
Here are 35 public repositories matching this topic...
Compute Sentence Embeddings Fast!
cython
embeddings
gensim
fse
fasttext
word2vec-model
maxpooling
document-similarity
wordembedding
sif
sentence-similarity
sentence-embeddings
sentence-representation
usif
gensim-model
swem
Updated Aug 5, 2020 - Python
Telegram Data Clustering contest solution by Mindful Squirrel
Updated Aug 23, 2020 - HTML
A Clojure library for querying large data-sets on similarity
clojure
lsh
similarity
collaborative-filtering
minhash
data-sketching
recommender-system
lsh-forest
jaccard-similarity
data-sketches
cosine-distance
minhash-lsh-algorithm
document-similarity
plagiarism-detection
similarity-search
hamming-distance
Updated Feb 17, 2019 - Clojure
Web Application for checking the similarity between query and document using the concept of Cosine Similarity.
flask
cosine-similarity
python-flask
plagiarism-checker
document-similarity
plagiarism-detection
python-project
Updated Jul 29, 2020 - Python
Document Search Engine Tool
search-engine
scrapy-spider
indexer
scrapy
text-summarization
search-algorithm
webcrawler
latent-dirichlet-allocation
bm25
spellchecker
document-similarity
wikipedia-search
wikipedia-crawler
ranking-algorithm
document-summarization
reverse-index
Updated Jul 4, 2020 - Python
Updated Jun 12, 2020 - Jupyter Notebook
Document search engine project using TF-IDF and the Google Universal Sentence Encoder model
python
data-science
machine-learning
deep-learning
tensorflow
text-analysis
semantic-search-engine
tensorflow-tutorials
tfidf
semantic-search
tensorflow-models
text-search
document-similarity
document-search
juypter
tfidf-text-analysis
text-semantic-similarity
universal-sentence-encoder
tfidf-vectorizer
python-text-analysis
Updated Apr 21, 2020 - Jupyter Notebook
Using Jaccard-Similarity and Minhashing to determine similarity between two text documents
Updated Mar 3, 2018 - Jupyter Notebook
Compilation of Natural Language Processing (NLP) code. Bonus: link to a compilation of Information Retrieval (IR) code (check out the README).
regex
word2vec
spacy
edit-distance
generative-model
ner
doc2vec
pos-tagging
document-similarity
word-similarity
hidden-markov-models
hmm-viterbi-algorithm
nlp-tools
discriminative-model
Updated Jul 15, 2020 - Python
Updated Jun 7, 2019 - Python
Web application for similarity between Professor and Keyword based on word embeddings
Updated Jan 31, 2019 - Jupyter Notebook
Rust-based text search engine from scratch supporting multiple document similarity metrics (TF-IDF, BM25, BM25VA)
Updated Apr 22, 2020 - Rust
A tool that can find any of your documents using semantic search
python
search-engine
natural-language-processing
database
sqlite-database
pandas
web-scraping
semantic-search
pos-tagging
relevant-search
document-similarity
rank-bm25
Updated May 14, 2020 - Python
The Bitnation Jurisdiction Public Notary DApp
Updated Sep 28, 2018 - JavaScript
Telegram Data Clustering Contest (Bossy Gnu)
Updated Aug 1, 2020 - C++
Code to train an LSI model using Pubmed OA medical documents and to use pre-trained Pubmed models on your own corpus for document similarity.
python
natural-language-processing
pubmed
medical-information
document-similarity
latent-semantic-analysis
topic-modelling
Updated Feb 17, 2019 - Python
A program that checks document similarity using the Natural Language Toolkit (NLTK) and cosine similarity.
Updated Aug 9, 2018 - Python
My Bachelor Thesis in Computer Science, FER, University of Zagreb
Updated Jul 10, 2018 - TeX
Document similarity algorithms experiment - Jaccard, TF-IDF, Doc2vec, USE, and BERT.
algorithm
deep-learning
tf-idf
jaccard
bert
new-york-times
document-similarity
universal-sentence-encoder
Updated Aug 11, 2020 - Python
A system for automatic tagging of metadata of theses and dissertations from Bicol University
Updated Sep 1, 2018 - Python
Natural language processing scripts
Updated May 29, 2018 - Jupyter Notebook
Document search from queries using an inverted index
Updated Apr 30, 2018 - Python
Simple document similarity module implemented in NodeJS
Updated Jan 20, 2018 - JavaScript
Document similarity using cosine distance, tf-idf, and latent semantic analysis.
Updated Feb 15, 2017 - R
Classifying news articles with deep learning to build an automatic newsletter
Updated Jul 19, 2018 - Jupyter Notebook
Was curious about how a plagiarism checker works, ended up learning about something completely different 😂
Updated Aug 13, 2020 - Python
Document Similarity with Apache Spark using Locality Sensitive Hashing and Python
Updated Mar 26, 2020 - Jupyter Notebook
The usage example in the word2vec.py doc-comment regarding KeyedVectors uses inconsistent paths and thus doesn't work.
https://github.com/RaRe-Technologies/gensim/blob/e859c11f6f57bf3c883a718a9ab7067ac0c2d4cf/gensim/models/word2vec.py#L73
https://github.com/RaRe-Technologies/gensim/blob/e859c11f6f57bf3c883a718a9ab7067ac0c2d4cf/gensim/models/word2vec.py#L76
If vectors were saved to a tm