• 5 years’ experience in data science, data analysis, and data engineering. Experienced in data mining, machine learning, and natural language processing (NLP) using Python and R. Passionate about learning new things, with a can-do approach.
• 10 years’ experience in software development using ASP.NET, C#, MVC, AngularJS, and REST APIs.
• Master of Data Science from the University of Sydney.
• Extensive experience in data collection, cleansing, and transformation; merging data from different sources and formats; extracting hidden patterns from that data; and providing recommendations and advice based on the findings.
• Experienced in data profiling, source-to-target mapping, data restructuring, and database redesign using various normalisation techniques.
• Strong background in mathematics and statistics, with experience in unstructured data processing (text and images), big data, machine learning, data mining, neural networks, and natural language processing (NLP).
• Passionate about applying data science to support business decision-making, building prediction and classification models, and data visualization using Tableau and Microsoft Power BI. Versatile knowledge of SQL (data extraction and manipulation) and strong software development skills in Python and R.
• Advanced Excel skills, including macros, VBA & ODBC connectivity for the purpose of data manipulation and presentation.
• Experienced in experimental design, hypothesis testing, and model building.
Build Conversational Chatbot
Implemented a basic chit-chat chatbot using a sequence-to-sequence (seq2seq) model and word embeddings, trained on Microsoft's Bot Builder chit-chat question-and-answer dataset. Once trained on that data, the bot answers user questions. Tools and technology used: NLP techniques including Word2Vec, TF-IDF, tokenization, stemming, and lemmatization, along with other functions from the NLTK and spaCy libraries.
Environment: NLP with deep learning, Python, TensorFlow
Computational prediction of Prostate Cancer by Machine Learning
I built machine learning and deep learning models to predict high-risk prostate cancer using only blood test results (without a biopsy). I obtained a dataset of 35,875 patients from the screening arm of the National Cancer Institute’s (NCI) Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. I segmented the data into instances without prostate cancer, instances with low-risk prostate cancer, and instances with high-risk prostate cancer. I developed a pipeline to deal with imbalanced data and proposed algorithms to pre-process such datasets. I evaluated the accuracy of various machine learning algorithms in predicting high-risk prostate cancer. The proposed pipeline achieved an accuracy of 91.5% using standard scaling, the SVMSMOTE sampling method, and AdaBoost.
Tools and technology used: machine learning models including a multi-layer neural network, logistic regression, KNN, random forest, AdaBoost, XGBoost, gradient boosting, LDA, and SVM.
Environment: Python and Jupyter Notebook
Master of Data Science
Units of study:
1. Principles of Data Science
2. Machine Learning and Data Mining
3. Computational Statistical Methods
4. Natural Language Processing
5. Deep Learning
6. Visual Analytics
7. Information Technology Capstone Project