https://www.techtarget.com/whatis/feature/8-data-science-projects-to-build-your-resume
Writing a resume tailored to a data science position is no easy task, but every open position requires one. A well-written resume is often the most critical factor in getting an interview for a job as a data scientist.
A good data science resume should be brief -- typically, just one page long, unless the applicant has many years of experience. The sections of the data science resume should include:
These sections help applicants demonstrate their backgrounds and knowledge in relevant areas.
Organizations looking to hire data scientists expect candidates to have either some previous work experience or, alternatively, data science-related projects. Job seekers transitioning to careers in data science right from college, switching careers or seeking different types of data science jobs can use projects to show prospective employers they have the necessary skills to do the work. A data science project portfolio should include three to five projects that showcase the applicant's relevant skills.
Here are eight data science projects to build your resume.
Today, data-driven companies use sentiment analysis to identify customers' attitudes about their products or services. Sentiment analysis is the automated process of determining whether opinions toward a product or service are positive, negative or neutral. These opinions are normally expressed in pieces of text.
The objective of sentiment analysis is to help a company figure out the answers to questions such as:
Customer opinions can range from positive to negative. Responses can be classed simply as positive or negative, or sorted into multiple categories -- i.e., excited, angry, happy, sad or another emotion.
This sentiment analysis data science project could be implemented in the R language, using the "janeaustenr" package, which provides the text of Jane Austen's novels as a data set. For this project, the job candidate will use general-purpose lexicons, including:
The applicant can then build a word cloud to display the results.
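The lexicon-based approach described above can be sketched in a few lines. This is a minimal illustration in Python rather than R, and the tiny `LEXICON` dictionary is a hypothetical stand-in for a published general-purpose lexicon; the function names are also assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (word -> score). A real project would load a
# published general-purpose lexicon instead of these hand-picked words.
LEXICON = {
    "happy": 1, "love": 1, "delighted": 1, "good": 1,
    "sad": -1, "angry": -1, "terrible": -1, "bad": -1,
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment_score(text):
    """Sum the lexicon scores of every token; unknown words count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokenize(text))


def classify(text):
    """Map the summed score to positive, negative or neutral."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_word_counts(texts):
    """Count how often each lexicon word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(t for t in tokenize(text) if t in LEXICON)
    return counts
```

The counts returned by `sentiment_word_counts` are the kind of word frequencies a word cloud would use as weights.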
Face detection, a method to distinguish a person's face from other parts of the body and the background, is a relatively simple undertaking and can be considered a beginner-level project.
The objective of face detection is to determine if there are any faces in an image or video. Each face that is found is enclosed by a bounding box, even when there is more than one face in the image or video. A job applicant should be able to build a simple face detector using Python. Building a program that detects faces is a great way to get started with computer vision.
The library used for this project is the Open Source Computer Vision Library (OpenCV), an open source computer vision and machine learning library with a focus on real-time applications.
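A simple detector along these lines might look like the sketch below. It assumes the `opencv-python` package is installed and uses the pretrained Haar cascade file that ships with it; the choice of a Haar cascade, the function names, the file paths and the tuning values (`scaleFactor`, `minNeighbors`) are illustrative assumptions, not something the project prescribes.

```python
def to_corners(box):
    """Convert an (x, y, width, height) detection to (x1, y1, x2, y2)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def detect_faces(image_path, output_path):
    """Draw a bounding box around each detected face and return the boxes."""
    # Imported here so to_corners stays usable without OpenCV installed.
    import cv2

    # Pretrained frontal-face Haar cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # The cascade works on grayscale images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    detections = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    boxes = [to_corners(tuple(int(v) for v in d)) for d in detections]
    for x1, y1, x2, y2 in boxes:
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imwrite(output_path, image)
    return boxes
```

Calling `detect_faces("group_photo.jpg", "group_photo_boxes.jpg")` (hypothetical file names) would write a copy of the image with one green rectangle per detected face.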
Face detection is one of the steps needed for facial recognition, which identifies whose face has been detected by matching it to a known, authorized person. Deep neural networks are generally the most effective method for facial recognition.
After a face is detected, deep learning can solve face recognition tasks through transfer learning with pretrained architectures such as VGG16, ResNet50 and FaceNet. These make it easier to build deep learning models, enabling users to build high-quality face recognition systems. Users can also train their own deep learning models from scratch. Face recognition models can be used in security systems and surveillance, for example.
Spam detection is a classic data science problem, as organizations need to monitor their communication channels for spam emails and messages to ward off data security threats. Google, Yahoo and other major email providers implement spam detection algorithms to handle the threats posed by spam emails.
Training a model to detect spam messages and spam emails is another project for data science applicants to use to build their resumes.
Project: Spam classification
Tools: Scikit-learn, spaCy, NLTK, Python
Data set: SMS Spam Collection Dataset from Kaggle
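To show the idea behind a spam classifier without any dependencies, the sketch below implements a small multinomial naive Bayes model using only the standard library. This is a stand-in chosen for illustration: a real project would more likely use Scikit-learn and the full data set, and the four training messages here are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesSpamFilter:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, messages, labels):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.label_counts = Counter(labels)
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        return self

    def _log_score(self, tokens, label):
        total_words = sum(self.word_counts[label].values())
        total_messages = sum(self.label_counts.values())
        # Start from the log prior of the class.
        score = math.log(self.label_counts[label] / total_messages)
        for token in tokens:
            if token in self.vocab:  # ignore words never seen in training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def predict(self, text):
        tokens = tokenize(text)
        return max(("spam", "ham"), key=lambda lbl: self._log_score(tokens, lbl))


# Invented training messages, for illustration only.
train_messages = [
    "win a free prize now",
    "free cash claim your prize",
    "are we still meeting for lunch",
    "see you at the meeting tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]
model = NaiveBayesSpamFilter().fit(train_messages, train_labels)
```

With this toy training set, `model.predict("claim your free prize")` returns `"spam"` and `model.predict("lunch meeting tomorrow")` returns `"ham"`.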
Using data to provide insights, tell stories and convince people of something is an important part of a data science job. What good is doing a top-notch analysis if the CEO doesn't understand it or take action based on it?
This data science project should enable laypeople, such as hiring managers with little coding or statistical backgrounds, to draw the appropriate conclusions. Data visualization and communication skills are important for this project to show and explain the applicant's code.
One example is a data visualization project that uses ggplot2 (a data visualization package for the statistical programming language R) to analyze certain parameters, such as the number of trips a Boston Uber driver makes in one day, one month, three months, six months or 12 months. The applicant could use the Uber pickups in Boston data set, for instance, and create visualizations for the different time frames of the year. This reveals how time affects customer trips.
Project: Uber data analysis project in R
Language: R
Data set: Uber pickups in Boston
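Before anything can be plotted, the pickups have to be counted per time frame. The sketch below shows that aggregation step in Python rather than R; the timestamps and their format are hypothetical, since the layout of the actual data set is not described here.

```python
from collections import Counter
from datetime import datetime

# Hypothetical pickup timestamps; the real data set's columns and date
# format are assumptions made for this example.
pickups = [
    "2014-04-01 08:15:00",
    "2014-04-01 17:40:00",
    "2014-04-02 09:05:00",
    "2014-05-03 23:30:00",
]


def trips_per_period(timestamps, period_format):
    """Count trips per period; period_format is a strftime pattern."""
    parsed = (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S") for ts in timestamps)
    return Counter(dt.strftime(period_format) for dt in parsed)


by_day = trips_per_period(pickups, "%Y-%m-%d")   # trips per calendar day
by_month = trips_per_period(pickups, "%Y-%m")    # trips per month
by_hour = trips_per_period(pickups, "%H")        # trips per hour of day
```

Each of these counters maps directly onto a bar chart: the keys become the x-axis and the counts become the bar heights.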
A recommender system is a platform that uses a filtering process to offer users content based on their preferences. It takes in information about the user, evaluates those parameters using a machine learning model and returns recommendations -- for example, a list of movies to watch.
A movie recommendation can be based on input received from people who have seen a particular film. Their responses can classify a movie as funny, boring, interesting, exciting or even a waste of time.
There are two types of recommender systems:
Netflix, for example, recommends movies or shows that are similar to a user's browsing history or movies that other users with similar browsing histories have watched in the past.
Project: Movie recommendation system project in R
Language: R
Data set: MovieLens dataset
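The idea of recommending what users with similar histories have watched can be sketched as follows. This is a minimal illustration in Python rather than R of what is often called user-based collaborative filtering, with cosine similarity as the assumed similarity measure; the users, titles and ratings are made up.

```python
import math

# Made-up ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "ana": {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "ben": {"Movie A": 4, "Movie B": 5, "Movie D": 5},
    "cara": {"Movie C": 5, "Movie D": 1, "Movie E": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated movies count as 0."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank unseen movies by the similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
                weights[movie] = weights.get(movie, 0.0) + sim
    predicted = {m: scores[m] / weights[m] for m in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]
```

Here `recommend("ana", ratings)` ranks "Movie D" ahead of "Movie E", because ben, whose tastes are closest to ana's, rated "Movie D" highly.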
This data science project is great for beginners. Optical character recognition (OCR) converts images of text, such as scanned pages or photos, into machine-encoded text. Computer vision can be used to load the image or PDF page. After loading the image, use pytesseract (an OCR tool for Python) to read the text it contains. The result is a string that can be displayed or processed further in Python.
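The steps above can be sketched as follows, assuming the `pytesseract` and `Pillow` packages and the Tesseract engine itself are installed. The function names and the whitespace cleanup step are illustrative additions, not part of pytesseract.

```python
import re


def clean_ocr_text(raw):
    """Collapse runs of spaces and tabs and drop blank lines left by OCR."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def image_to_text(image_path):
    """Run OCR on an image file and return the recognized text as a string."""
    # Imported here so clean_ocr_text stays usable without the OCR stack.
    from PIL import Image
    import pytesseract

    raw = pytesseract.image_to_string(Image.open(image_path))
    return clean_ocr_text(raw)
```

A call such as `print(image_to_text("scanned_page.png"))` (hypothetical file name) would display the recognized text.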
Once data science job applicants thoroughly understand how OCR works and which tools it requires, they can tackle more complex problems, such as using sequence-to-sequence attention models to translate the text the OCR reads from one language into another.
Time series prediction is the study of how metrics behave over time. The time series technique is commonly used in data science with a wide range of applications, including weather forecasting, predicting sales, analyzing annual trends and analyzing website traffic.
A sharp increase in traffic to a website can be a major problem for a company, as it can cause the site to load slowly or crash entirely. Predicting website traffic can enable the company to make better decisions to control the congestion.
Project: Web traffic time series forecasting
Tools: Google Cloud Platform
Algorithms: Recurrent neural networks, long short-term memory (LSTM), autoregressive integrated moving average (ARIMA)-based techniques
Data set: The data set consists of 145,000 time series, representing the number of daily page views of different Wikipedia articles.
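Before training any of the algorithms listed above, it helps to have a simple baseline to compare against. The sketch below is one such baseline, a seasonal naive forecast that repeats the most recent week; it is not one of the listed algorithms, and the page view numbers are invented for illustration.

```python
def seasonal_naive_forecast(series, horizon, season_length=7):
    """Forecast by repeating the most recent full season (default: a week)."""
    if len(series) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = series[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented daily page views for one article: two weeks of history.
history = [120, 130, 125, 128, 140, 90, 85,
           122, 133, 127, 131, 143, 92, 88]
actual_next_week = [125, 131, 129, 130, 145, 95, 86]

forecast = seasonal_naive_forecast(history, horizon=7)
error = mean_absolute_error(actual_next_week, forecast)
```

A more sophisticated model is only worth its complexity if it produces a lower error than a baseline like this one on held-out data.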
One of the key decisions data science job applicants have to make is what data to analyze with any project.
Here are some websites where applicants can find data to work with.
The best projects to showcase are ones that can be presented succinctly. A well-constructed description of the project can be presented in a few sentences to a paragraph.
When adding data science projects to a resume, applicants should include:
Although many recruiters and hiring managers will follow links and look at candidates' project presentations on their websites or portfolio sites, some will only look at a candidate's GitHub.
As such, applicants should know the basics of GitHub and be familiar with Git -- a version control system they can use to manage and keep track of their source code histories.
Data scientists are in high demand. Consequently, there's enormous potential for growth in this field for skilled professionals. To break into the field of data science, job applicants must impress prospective employers by showcasing their skills and expertise. They can demonstrate they have the necessary skills by adding data science projects to their resumes.
08 Jun 2021