Machine learning at small organizations

January 17, 2023

Practical AI: Machine Learning, Data Science

Podcast Summary

From data science to business value in small organizations: Small organizations often require team members to handle various tasks, making ML projects unique. Start by converting processes from Excel to Python to demonstrate data science's power and potential in small businesses.
The role of a data scientist in organizations, especially smaller ones, is not just about building models but converting data into business value using data science techniques. Kirsten Lumb, the co-founder and CPO of Storytellers AI, shared her experience of starting in the field without a data science degree but through analytics in startups. She emphasized that small organizations often require team members to handle various tasks and figure out what needs to be done, making ML at small organizations unique. Kirsten's first data science project involved converting a marketing process from Excel to Python, significantly reducing the time spent and demonstrating the power of data science. Although her initial project was at a large company, the lack of access to data scientists made her realize the potential of data science in smaller organizations. Overall, the conversation highlighted the importance of data science in small organizations and the unique challenges and opportunities it presents.
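As an illustration of the kind of Excel-to-Python conversion described above, here is a minimal sketch that replaces a manual spreadsheet roll-up with a short script. The episode does not describe the actual marketing process, so the column names, the sample rows, and the `cost_per_conversion_by_channel` helper are all assumptions made for this example; the data is inlined as CSV text standing in for a sheet exported from Excel.

```python
import csv
import io
from collections import defaultdict

# Hypothetical marketing sheet, as it might look after "Save as CSV" in Excel.
CAMPAIGN_CSV = """campaign,channel,spend,conversions
spring_sale,email,1200.00,48
spring_sale,search,3000.00,60
back_to_school,email,800.00,40
back_to_school,social,1500.00,25
"""


def cost_per_conversion_by_channel(csv_text):
    """Sum spend and conversions per channel, then divide (a pivot-table stand-in)."""
    spend = defaultdict(float)
    conversions = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        spend[row["channel"]] += float(row["spend"])
        conversions[row["channel"]] += int(row["conversions"])
    # Skip channels with zero conversions to avoid dividing by zero.
    return {
        channel: round(spend[channel] / conversions[channel], 2)
        for channel in spend
        if conversions[channel] > 0
    }


report = cost_per_conversion_by_channel(CAMPAIGN_CSV)
for channel, cost in sorted(report.items()):
    print(f"{channel}: {cost:.2f} per conversion")
```

Once the logic lives in a script like this, rerunning it on next month's export takes seconds instead of repeating the manual spreadsheet steps, which is the time saving the episode points to.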
Small organizations can make a big impact with data science: Small businesses can leverage data science and machine learning for growth despite perceived challenges like hiring, data readiness, and system integration
Even a small organization without extensive resources can make a significant impact by embracing data science and machine learning. This was demonstrated by a marketing analyst who, with her knowledge of Python and data analysis skills, was able to streamline processes and bring about growth in her business. However, many small organizations may feel daunted by the perceived challenges, such as hiring the right person, ensuring data readiness, and integrating data science into existing systems. These concerns can often be overestimated, and the data science community could do more to provide accessible resources and guidance for small businesses looking to make the transition. Ultimately, the potential benefits of data science, including increased efficiency and growth opportunities, outweigh the perceived obstacles.
The Role of a Data Scientist in a Small Organization: Data scientists in small organizations pull data together, build models, and explain how to integrate them into the business, a crucial role as data and technologies vary.
While low-code and no-code tools are making it easier for non-technical team members to perform certain tasks, there will always be a need for data scientists and analysts to reconcile and make sense of the data from various sources. The role of a data scientist in a small organization is to pull data together, build models, and explain how to integrate them into the business. This role is essential as data and technologies used by organizations vary, making it necessary to have someone who can reconcile and interpret the data. The use of tools like Jasper for content generation and libraries for specific tasks can augment the work of team members, but the role of a data scientist remains crucial in making sense of the data and guiding the organization. This role is not only necessary but also the most enjoyable part of data science for many, as it involves problem-solving and adapting to unique scenarios. The idea that BI tools would make the role of a Business Intelligence analyst obsolete did not hold true, and similarly, the role of a data scientist will continue to be essential in the future.
Broad understanding of data science workflow essential for small companies: Data scientists at small companies should handle aspects of the data science process beyond model training, including data infrastructure, ETL, model deployment, and simple pipelines.
For data scientists or machine learning professionals at small companies, it's essential to have a broad understanding of the entire data science workflow rather than being an expert in just one area. Instead of focusing solely on model training, they need to be able to handle various aspects of the process, including data infrastructure, ETL, model deployment, and simple pipelines. This doesn't mean they have to be experts in MLOps or come up with new ways of doing it, but they should have a good grasp of the basics to deploy their models and manage simple pipelines. The discussion also touched upon the democratization of technology and how a 9-year-old building a drone from Lego pieces shows that complex tasks can be simplified, making it possible for more people to innovate and contribute in various fields, including open-source development.
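To make "ETL and simple pipelines" concrete, the following is a minimal sketch of an extract-transform-load step using only Python's standard library. The order data, the table schema, and the function names are hypothetical; a real pipeline would read from the organization's own sources and likely write to a persistent database rather than an in-memory one.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with the kinds of mess small teams often see:
# stray whitespace, inconsistent casing, and an unparseable value.
RAW_ORDERS = """order_id,customer,amount
1, alice ,19.99
2,BOB,5.00
3,alice,not_a_number
4,carol,42.50
"""


def extract(csv_text):
    """Read the raw rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Normalize customer names and drop rows whose amount cannot be parsed."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip the bad row instead of failing the whole run
        clean.append((int(row["order_id"]), row["customer"].strip().lower(), amount))
    return clean


def load(rows, conn):
    """Write cleaned rows into a SQLite table, replacing any existing order_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
load(transform(extract(RAW_ORDERS)), conn)
totals = dict(conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer"))
print(totals)
```

Keeping the three stages as separate functions is one simple way to make each step testable and replaceable without any dedicated MLOps tooling.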
Developing patterns for success in application building and management: Develop strong project management skills, focus on tabular data and gradient boosted trees, and have a clear baselining process to ensure progress and impact in application development
Building and managing applications, especially in a small business environment, requires a strong foundation in project management skills and a focus on simple, effective solutions. When bringing someone new into this field, it's important to provide them with clear patterns or recipes for success. One unintuitive but crucial pattern is to develop strong project management skills, including the ability to shepherd a project from the very beginning, even when the data isn't yet in a database, through to completion. Another important recipe is to focus on tabular data and use gradient boosted trees as a baseline model. Lastly, having a clear baselining process is essential to knowing when a model is good enough to move on to the next project, as multiple good-enough models can have a greater impact on a business than one perfect model. These patterns and recipes can help set someone up for success in the complex and ever-evolving world of applications.
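One way to picture a baselining process is as an explicit stopping rule: score a naive baseline, score the candidate model, and move on once the lift clears a threshold agreed in advance. The sketch below assumes a classification task; the labels, the `min_lift` threshold of 0.05, and the `model_preds` list are hypothetical. In practice `model_preds` would come from a trained model such as gradient boosted trees, which is not trained here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """Naive baseline: always predict the most common training label."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n


def good_enough(model_score, baseline_score, min_lift=0.05):
    """Stopping rule: the model must beat the baseline by at least min_lift."""
    return model_score - baseline_score >= min_lift


# Hypothetical labels and predictions, for illustration only.
y_train = [0, 0, 0, 1, 0, 1, 0, 0]
y_test = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
model_preds = [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]  # stand-in for a trained model's output

baseline_score = accuracy(y_test, majority_baseline(y_train, len(y_test)))
model_score = accuracy(y_test, model_preds)

print(f"baseline={baseline_score:.2f} model={model_score:.2f}")
print("ship it and move on" if good_enough(model_score, baseline_score) else "keep iterating")
```

The useful part is less the metric than the agreement up front on what "good enough" means, so the decision to move to the next project is not reopened on every iteration.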
Communicating Value in Small Businesses: In small businesses, data scientists must focus on delivering measurable results, communicate clearly, prioritize effectively, and collaborate with other teams to navigate unique challenges and ensure the value of data science is understood.
In small businesses, where strategies and priorities can change rapidly, data scientists must focus on delivering measurable results and maintaining clear communication with their teams and stakeholders. This can help mitigate the instability and ensure that everyone understands the value data science brings to the company. Additionally, having a well-defined prioritization framework and being flexible with changing roadmaps can help data scientists navigate the unique challenges of a small business environment. Furthermore, effective collaboration with other teams, such as software development or infrastructure, is crucial for successful implementation of data science projects. Ultimately, it's important to remember that people are the key to getting things done in any organization, and strong relationships and communication are essential for success in data science.
Building trust as a data scientist goes beyond technical skills: Understand org architecture, meet key people, and align work with their goals for stronger collaboration and successful projects. Use tools like Trello or Google Sheets for effective project management.
Earning trust within an organization as a data scientist goes beyond just technical skills. It's crucial to understand the organization's architecture, meet with key people, and identify how your work can help them achieve their goals. This can lead to stronger collaboration and more successful projects. When it comes to project management for data science, there's no one-size-fits-all solution. Some may prefer tools like Trello or Jira, while others might find success with simpler methods like Google Sheets. The key is to find a system that works for your specific needs and makes the project management process clear and accessible to everyone involved. For those starting out, Trello is a great option as it's shareable and offers templates for data science projects. Google Sheets is also a versatile tool that can be especially helpful for smaller teams or organizations without a well-established project management system. By experimenting with different tools and finding what works best for your team, you can streamline your project management workflow and build trust and collaboration within your organization.
Effective communication and strong relationships in data science projects: Clear communication and collaboration with stakeholders through simple project management frameworks and agile methodologies foster trust and ensure everyone is informed. Data science benefits all functions, so involving non-technical team members can lead to growth.
Effective communication and strong relationships are crucial for the success of data science projects in small organizations. While focusing on creating accurate models and pipelines is important, it's equally essential to prioritize downstream processes and relationships with stakeholders. Regular communication through simple project management frameworks and agile methodologies can help build trust and keep everyone informed. Additionally, it's important to remember that data science can benefit all functions within an organization, and bringing non-technical team members on board with a data-centric mindset can lead to significant growth. By being patient, clear, and persistent in communicating the benefits of data science, you can help create a culture that values and integrates data-driven decision making.
Demonstrating the Value of Data Science in Small Organizations: Data scientists at small organizations must build trust and educate colleagues through effective A/B testing, and they benefit from a simpler machine learning tech stack that is easier to learn and deploy.
As a data scientist in a small organization, it's not just about delivering accurate results, but also about educating your colleagues about the benefits and impact of data science. This requires a strong A/B testing framework to demonstrate the product's value. Building trust within a small organization is crucial, and delivering results is the output of earning that trust. Small machine learning organizations have advantages over larger ones, as they often deal with simpler parts of the machine learning tech stack, such as batch tabular inference, which can be easier to learn and deploy. However, the responsibility of a data scientist in a small organization goes beyond just their work; they also represent the discipline within the company and must convince others of its value.
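The episode does not specify what an A/B testing framework should contain, but its statistical core is often a comparison of conversion rates between a control and a variant. The sketch below shows one common choice, a two-sided two-proportion z-test, using only the standard library. The function name and the traffic numbers are assumptions for illustration, and a real framework would also cover sample-size planning and experiment assignment.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (a) and variant (b).

    Returns (absolute lift, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical experiment: 1,000 users per arm.
lift, z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"lift={lift:.3f} z={z:.2f} p={p_value:.4f}")
```

Reporting the lift alongside the p-value gives colleagues a business-facing number (how much better the variant converts) as well as a measure of how much confidence to place in it.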
Impact of Company Size on Data Scientist Role: Smaller companies offer a broader perspective, but lack resources. Mid to larger-sized firms provide opportunities to learn from pros and acquire essential skills. Ultimately, choose based on personal goals and available opportunities.
The size of a company can significantly impact a data scientist's role and opportunities for growth. At smaller companies, data scientists may have the advantage of a broader perspective, as they might get to engage in various aspects of machine learning, including data engineering and MLOps. However, they may lack the resources and established processes found in larger organizations. For those just starting their careers in data science, it's generally recommended to join a mid to larger-sized company to learn from experienced professionals and acquire essential skills. Startups led by data science experts can also be excellent opportunities for mentorship. Ultimately, the choice between small and large companies depends on individual goals, career stage, and the specific opportunities available. Unfortunately, there's no comprehensive resource that covers end-to-end data science workflows, making practical experience a crucial aspect of mastering the field.
Exploring the potential of data science in small organizations: Understanding the entire workflow and observing processes in larger organizations can help implement effective data science techniques in smaller businesses. Exciting potential for user-friendly MLOps tools and measuring excellence by impact in small orgs.
To excel in data science, particularly in smaller companies, it's crucial to understand the entire workflow from data preparation to storytelling. Observing the processes of colleagues upstream and downstream in larger organizations can provide valuable insights for implementing effective data science techniques in smaller businesses. Kirsten, a data science leader, expressed her excitement about the potential of data science in small organizations, particularly in the development of user-friendly MLOps tools and the shift in measuring excellence by impact rather than just state-of-the-art performance. The future of data science lies in its ability to make a tangible difference in various industries, from education to universities, and the community's focus on creating practical solutions for small businesses.

Recent Episodes from Practical AI: Machine Learning, Data Science

Stanford's AI Index Report 2024

We’ve had representatives from Stanford’s Institute for Human-Centered Artificial Intelligence (HAI) on the show in the past, but we were super excited to talk through their 2024 AI Index Report after such a crazy year in AI! Nestor from HAI joins us in this episode to talk about some of the main takeaways including how AI makes workers more productive, the US is increasing regulations sharply, and industry continues to dominate frontier AI research.

Practical AI: Machine Learning, Data Science

July 02, 2024

Apple Intelligence & Advanced RAG

The perplexities of information retrieval

Using edge models to find sensitive data

Rise of the AI PC & local LLMs

AI in the U.S. Congress

First impressions of GPT-4o

Full-stack approach for effective AI agents

Autonomous fighter jets?!

Private, open source chat UIs

Related Episodes

When data leakage turns into a flood of trouble

Stable Diffusion (Practical AI #193)

AlphaFold is revolutionizing biology

The nose knows

Zero-shot multitask learning (Practical AI #158)