OpenRefine is a free, open source power tool for working with messy data and improving it
#datacleaning
Repositories 49
Perform data cleaning through configuration instead of code
Java
Updated Jan 28, 2019
Deduplicates and cleans Tongdaxin (通达信) data and stores it in MongoDB for later research
Python
Updated Mar 30, 2018
machine-learning
ml
nlp-machine-learning
time-series
data
datacleaning
naive-bayes-classifier
folium
plotly
portfolio
HTML
Updated Sep 2, 2018
Examples for Optimus, a data cleansing library for big data.
Updated Oct 24, 2017
All Kaggle datasets and the R code
HTML
Updated Mar 12, 2019
Spark-lean, an interactive PySpark-based Data Cleaning Library
My fictitious firm, GDSMC Global, is a security consultancy focusing on supporting governments around the world in un…
datacleaning
imputation
visualization
correlation-matrices
predictive-modeling
logistic-regression
cleaning
r
business-solutions
consulting
deployment
R
Updated Feb 28, 2018
Machine Learning Project on Imbalanced Data in R
machine-learning
support-vector-machine
xgboost-algorithm
naive-bayes-algorithm
imbalanced-learning
oversampling
undersampling
smote
hypothesis-testing
dataexploration
datacleaning
feature-engineering
R
Updated Feb 24, 2018
This analysis mainly aims to find a way to decide which of these clients without financials actually have over 5 …
Jupyter Notebook
Updated Mar 6, 2019
The course material from multiple sources
Jupyter Notebook
Updated Mar 17, 2019
Text Preprocessing
Python
Updated Nov 12, 2017
Great Milad Motion Repository Files & Projects
machinelearning
classification
clustering
nlp
textmining
front-end
webdesign
softwaredevelopment
datascience
python
back-end
prediction
api
nodejs
anomaly-detection
datacleaning
Jupyter Notebook
Updated Aug 1, 2017
Datathon sponsored by the City of Los Angeles and Accenture to forecast the payroll data and optimize the inefficient…
R
Updated Nov 19, 2018
Learning Material for the Computer Language Workshop, Fall 2018, Economics Department, The New School for Social Rese…
HTML
Updated Oct 12, 2018
Jupyter Notebook
Updated Jun 21, 2018
This application demonstrates how the R software can be used in data cleaning processes
Updated Mar 5, 2019
It's commonly said that data scientists spend 80% of their time cleaning and manipulating data and only 20% of their …
Updated Nov 16, 2018
Practicing data cleaning with NumPy and pandas.
Jupyter Notebook
Updated Dec 24, 2018
Excel Based Projects
Updated May 8, 2018
exploratory-data-analysis
datascience
datacleaning
datacleansing
imputation
r
rprogramming
rstudio
pca
pca-analysis
eda
dataframe
hackathon2017
HTML
Updated Nov 13, 2017
A list of the exploratory data analysis projects I have done. Kaggle Kernels link: https://www.kaggle.com/juli0703/ker…
eda
python
pandas
jupyter-notebook
kaggle-dataset
data-visualization
data-analysis
dataprocessing
datacleaning
Jupyter Notebook
Updated May 22, 2018
CSVParser is a tool to parse CSV files using the univocity and Commons CSV parsers. It cleans new line (\n) character & sp…
univocity
csv-parser
opencsv
datacleaner
newline
quotes
garbage-segregation
datacleaning
datacleansing
csvparser
Java
Updated Jan 19, 2019
R
Updated Sep 3, 2017
Using OpenRefine: Cleaning "University" and "Musician" data, visualizing through a scatter plot
Updated Jun 28, 2018
R
Updated Feb 20, 2018
R
Updated May 4, 2018
Jupyter Notebook
Updated Aug 20, 2017
Running exciting analyses on interesting datasets is the dream of every data scientist. But first, some importing and…
Updated Nov 18, 2018
data
data-analysis
python
jupyter-notebook
prosper-loan-data
prosper
udacity
nanodegree
portfolio
fifa
analytics
eda
exploratory-data-analysis
explanatory-data-analysis
seaborn
datavisualization
datawrangling
datacleaning
Jupyter Notebook
Updated Feb 26, 2019