Real-world problems pose many challenges that students don’t necessarily face at university. In school, they are used to getting a structured problem and a popular dataset, and eventually arriving at the exact solution. However, problems in industry are often unstructured and complex. Unverified assumptions about the problem can backfire in the real world. It is better to understand the business problem completely before diving into the analysis. Understanding a business problem involves doing more research on the problem and its domain, planning, asking the clients the right questions and discussing it with team members.
Data science is about logical thinking, generating ideas and being creative in solving problems. Hence, teamwork plays an important role in data science. It is also necessary to think multidimensionally rather than one-dimensionally. Team members may come from diverse backgrounds with different skill sets. Play to the strengths of each team member and distribute the work accordingly. This helped me to solve problems in different ways and learn new things.
Data science is about sharing and collaboration. You need to understand the views of others in the team. Many times, other team members come up with good ideas, and it's important to understand them properly in order to implement them successfully in the project. As I said above, data science is not a one-man show; it is always a team effort.
Data science, or AI, is a fast-evolving field, and as a result there will always be something new and crucial to learn. It is very hard to remember everything, and documentation helped me to overcome this challenge. It also helped me to crystallise my own thought process. I document my learnings, analysis, modelling process, experiments and code. I also write up failed experiments and the reasons they failed in detail, which helps sharpen my ideas in the long run. Proper documentation has helped me to improve my communication and understand concepts in detail. You can document even the small things you learn or come across; they make a big difference in the long run. Use whatever tools suit you to document.
Working in an agile environment gives me clear planning, prioritisation and direction at the start of each sprint. Having an agile mindset helps in responding to change and handling uncertainty. If you come across uncertainty, try out options, collect feedback and improve iteratively. Working this way also gave me an opportunity to collaborate with different teams. Presenting a minimum viable product (MVP) in the form of a machine learning model to stakeholders at the end of each sprint helped me to shape my projects better. The feedback at the end of each sprint also helped me to correct my mistakes and deliver the project efficiently.
Storytelling is an important part of data science. We crunch the data, create a model and find the insights. But what does this model say in business terms? In other words, how does this model generate money for the company or solve the problem? Stakeholders and management are not interested in p-values or any other statistics. The main challenge here is explaining the model in simpler terms to a non-technical audience in an engaging manner. One way is to explain the model via a short story. This is one of my biggest lessons from the last year. Always include good visualisations, as they help to convey the message as a story. Storytelling is an art, and it takes time and a lot of practice.
We usually rely on a traditional PowerPoint (PPT) deck to show our work to clients or stakeholders. Instead of PPT, why don’t we create a web app or dashboard to explain our model output? Creating a web app or dashboard shows commitment to the project and also helps you connect with stakeholders and clients.
Version control is something everyone should include in their workflow. It helps you manage your code centrally rather than saving it on a PC, laptop or external drive. This way, you can refer to the code or documents from any location whenever you are working on a new project.
I significantly improved my coding skills during the last 8 months. One thing I have learned in my work as well as in competitions is to write functional or object-oriented code to maximise code reusability. This makes the code usable in future projects and saves time in the current one. Whenever I referred to Stack Overflow or Google, I documented what the code did, and this helped me to learn new things about coding. Always follow best practices and keep your code reader-friendly.
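As a rough illustration of the reusability point, here is a minimal Python sketch. The function names and sample values are hypothetical, made up for this example rather than taken from any project described here. It shows two small, documented helpers that could be imported into more than one project instead of being rewritten each time.

```python
from statistics import mean


def clean_column_name(name: str) -> str:
    """Return a lowercase, underscore-separated version of a column name."""
    return "_".join(name.strip().lower().split())


def fill_missing_with_mean(values: list) -> list:
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to average, so return a copy of the input unchanged.
        return list(values)
    average = mean(present)
    return [average if v is None else v for v in values]


# Hypothetical usage with made-up inputs.
print(clean_column_name("  Customer Age "))
print(fill_missing_with_mean([10, None, 20]))
```

Keeping each helper small, with a one-line docstring, is one way to keep code reader-friendly; a class would work just as well where the steps share state.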
Data science is a blend of computer science, statistics, machine learning and domain expertise. Hence, you need skills for handling every step, from cleaning data to interpreting the final model and deploying it. Don’t be intimidated; you can’t master data science in one day. If you get into a difficult situation, feel free to ask for help. You will gain more knowledge that way, and it will eventually make you confident about your approach.
AI is the new buzz in the IT industry, and you won't be able to get up to speed on all of it in a short period of time. Approach it strategically by investing one or two hours every day in learning new concepts and solving new problems: learning a new algorithm, coding, reading a blog, doing personal projects and so on. Apart from all this, I would highly recommend reading non-technical books, which help a lot with flow and storytelling technique, a useful skill as we move on.
During my initial days, I was under the impression that in this analytical world everyone is a master of everything. Later I realised that my assumption was wrong. I understood that it's a continuous learning process for everyone here. The core things you need to stay current in this game are passion, curiosity and a thirst to learn more. Be it machine learning, deep learning or NLP, it is always passion that solves complicated problems.
Dhilip is a Qrious data science intern and member of Q.Lab, our innovation hub.