Last month, a most unusual AWS re:Invent came to an end. This year featured none of the crowds, ultra-marathon walks, or nightlife we have come to expect from years prior. It was, however, jam-packed with information on new services being offered by AWS.
At Atrium, our relationship with AWS has always been, and will likely continue to be, very focused on AI and machine learning — and services that enable us to build predictive models for our customers. It is a bit akin to walking into a five-story department store and spending all of our time in the electronics section, tucked away in a corner. As a small but growing team, we have to stay close to our mission and keep our eyes on the specific value that we bring.
With that focus during re:Invent, we saw AWS making advancements across three major areas: data wrangling, machine learning operations (MLOps), and its collaborations with Salesforce. My colleagues and I spent some time exploring those areas, as well as major themes from re:Invent and the tools and opportunities that are most exciting for us and our customers.
Preparing data for analysis continues to consume a large portion, sometimes the majority, of the time needed to build predictive models. The data science community is still in its “toddler” stage regarding industrial-grade processes and tooling. AWS is doing its part to advance these efforts with some of the new additions to SageMaker Studio. SageMaker Data Wrangler appears to be a compelling set of tools for data science teams to gather, transform, and visualize data.
In addition, the newly announced SageMaker Clarify provides tools to identify statistical bias in models and help teams deal with it. As a team, we are focused on eliminating bias from our predictions (statistical or otherwise), and this is a tool we are planning to use in that pursuit.
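To make "statistical bias" a little more concrete, here is a minimal sketch of two pre-training measures of the kind Clarify reports: class imbalance (CI) and difference in proportions of labels (DPL). This is plain Python written for illustration, not Clarify's API. The function names, the "A"/"B" group values, and the sample data are all assumptions made up for this example.

```python
def class_imbalance(facets, advantaged):
    """CI = (n_a - n_d) / (n_a + n_d), ranging from -1 to 1.

    n_a counts rows in the advantaged group, n_d counts all other rows.
    """
    n_a = sum(1 for f in facets if f == advantaged)
    n_d = len(facets) - n_a
    if n_a + n_d == 0:
        raise ValueError("no rows to measure")
    return (n_a - n_d) / (n_a + n_d)


def difference_in_proportions_of_labels(facets, labels, advantaged, positive=1):
    """DPL = q_a - q_d, the gap in positive-label rates between groups."""
    a_labels = [l for f, l in zip(facets, labels) if f == advantaged]
    d_labels = [l for f, l in zip(facets, labels) if f != advantaged]
    if not a_labels or not d_labels:
        raise ValueError("both groups need at least one row")
    q_a = sum(1 for l in a_labels if l == positive) / len(a_labels)
    q_d = sum(1 for l in d_labels if l == positive) / len(d_labels)
    return q_a - q_d


# Hypothetical data: group membership and a binary outcome per row.
facets = ["A"] * 6 + ["B"] * 4
labels = [1, 1, 1, 0, 1, 0] + [1, 0, 0, 0]

ci = class_imbalance(facets, advantaged="A")
dpl = difference_in_proportions_of_labels(facets, labels, advantaged="A")
print(f"CI={ci:.3f} DPL={dpl:.3f}")
```

Values near zero suggest the groups are similarly represented and labeled. Larger magnitudes are a prompt to look more closely at the data before training on it.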
MLOps has been a hot topic at Atrium. It is still a relatively new concept, but there is a strong drive to bring the same level of rigor and automation to MLOps that we have traditionally applied to software development. SageMaker Pipelines is an interesting addition to Studio, as it may be the platform to build on.
It was also interesting to see that AWS has created its own machine learning-focused processor, Trainium, along with Trainium-based EC2 instance types. With the current spike in Bitcoin and other cryptocurrencies, and the associated run on GPUs from NVIDIA and AMD, this might bring down the (hopefully short-term) costs associated with processing very large data sets.
Along these lines, we have the introduction of SageMaker Feature Store, which looks to allow for greater collaboration and reuse of feature sets across models. Rather than starting from scratch or grabbing a previously used query from an earlier model, teams should find it a big improvement to reuse feature sets with a click or two within Studio. We are also hoping that this will lead to better collaboration within larger data science teams by creating a single, canonical view of the features.
Atrium’s MLOps expert, Rick Arnett, confirmed that I’m not the only one interested in these new features: “I’m excited about MLOps, specifically Amazon taking more ownership of MLOps pipeline and extending capabilities with new products (e.g., Feature Store, Workflows, AWS Glue DataBrew, Data Wrangler, etc.).”
One of Atrium’s data scientists, Chris Barbour, Ph.D., is also intrigued by the possibilities around Feature Store:
- It provides a single source of truth.
- It enables reusability across all machine learning models.
- It speeds up model development, as pre-engineered features are always available.
- It allows for “time travel”: potentially reconstructing exact feature data from a specific point in time and rebuilding full datasets from historical events.
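The “time travel” idea can be sketched as a point-in-time lookup: for each entity, keep the most recent feature record written on or before a cutoff. The snippet below is a simplified illustration in plain Python and does not use the Feature Store API. The record layout, the `features_as_of` helper, and the sample account data are all hypothetical.

```python
from datetime import datetime

# Hypothetical feature records: (entity_id, event_time, features).
RECORDS = [
    ("acct-1", datetime(2020, 1, 15), {"open_cases": 2}),
    ("acct-1", datetime(2020, 6, 1), {"open_cases": 5}),
    ("acct-2", datetime(2020, 3, 10), {"open_cases": 0}),
    ("acct-2", datetime(2020, 11, 20), {"open_cases": 3}),
]


def features_as_of(records, cutoff):
    """Return each entity's latest features with event_time <= cutoff."""
    latest = {}
    for entity_id, event_time, features in records:
        if event_time > cutoff:
            continue  # written after the cutoff, so not yet visible
        current = latest.get(entity_id)
        if current is None or event_time > current[0]:
            latest[entity_id] = (event_time, features)
    return {eid: feats for eid, (_, feats) in latest.items()}


# Rebuild the feature view as it would have looked on July 1, 2020.
snapshot = features_as_of(RECORDS, datetime(2020, 7, 1))
print(snapshot)
```

Training a model on snapshots like this, rather than on the latest values, helps avoid leaking information that was not available at the time of the historical event.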
Collaboration Between AWS and Salesforce
One of our pillars of success is CRM. We absolutely want to be the best in the world at providing insights into our customers’ data, but we generally look to the Salesforce platform’s data as a major input into, and the means of taking action on, those insights.
The re:Invent session “Your Fastest Path from Idea to Impact” gave more insight into the strategic partnership between AWS and Salesforce. Some of the key areas of collaboration:
- Amazon AppFlow can be used to securely move data between Salesforce and AWS without the need to manage API infrastructure.
- Running MuleSoft and Tableau on the AWS Cloud.
- Machine learning capabilities from Amazon Connect, available natively within Service Cloud Voice. This provides integration with prebuilt machine learning models, such as sentiment analysis, to assist in agent coaching and productivity.
- Joint learning initiatives hosted on the Trailhead learning platform, with over 100K AWS-focused badges earned on the platform within the last year.
Data scientists have an insatiable appetite for data. Most often, when we engage with companies eager to take advantage of machine learning, we need to spend time with them creating and nurturing new, clean data sources from which to extract signals.
While not new at re:Invent in 2020, Salesforce Service Cloud Voice continues to interest us, as it has the potential to gather significant information for customer retention models and efforts. Call centers can be fertile ground for gathering useful information on a company’s relationship with its customers. This is an area of interest for us at Atrium, especially because Service Cloud Voice is a product partnership with Amazon, built on Amazon Connect. This data will be very useful in customer retention models and processes, but it has to be gathered in a way that is mindful of ever-growing privacy concerns.
Keep the Momentum After AWS re:Invent
Though 2020 re:Invent was unlike any other, we still saw the latest advances in AWS technologies and got insight into new products and opportunities for collaboration with AWS and Salesforce.
Learn more about Atrium’s analytics and AI expertise and the services we offer.