In December 2019, a couple of members of the Qrious team headed over to Las Vegas for AWS re:Invent, the tech giant’s annual customer conference. With 65,000 attendees, it’s the biggest cloud computing conference in the world, and this year saw 77 product launches, feature releases and services announced. It’s a lot to take in, so here are the stories that excited our team the most.
A peek under the AWS hood
According to Innosight’s 2018 Longevity Forecast, the average lifespan of a Fortune 500 company has fallen from 35 years in 1964 to just 15 years in 2019. There’s no question that innovating and transforming at scale is key not just to success, but to continued survival in a competitive and disruptive environment. As the third largest company in the world, Amazon has continually built and reinvented itself – moving from selling books in 1994 to selling cloud computing (along with literally everything else) over the past 25 years.
Re:Invent attendees gained insight into what makes AWS so powerful in a leadership session on enterprise transformation. Joe Brigden, VP, AWS Managed Services and AWS, shared how AWS approaches digital transformation by creating a culture of innovation, and by modernising technology and methodologies to free up resources. For Brigden, these include:
Two pizza teams: Customer-obsessed, small teams that are empowered to own what they create (and small enough that they can be catered for with just two pizzas)
Architecture that supports rapid growth and change
But culture is only half of the story. Technical innovation and development are key to Amazon’s growth as well. In fact, “how does Amazon build?” is the question Charlie Bell, SVP for AWS, is asked most by customers. At re:Invent, Bell launched an extraordinary resource: the Amazon Builders’ Library. It’s a living collection of articles from inside AWS, written by Amazon’s senior technical leaders and engineers, that explains their approach to innovation and development.
Bell believes that “there’s no question the world will be a better place if everyone can innovate more quickly and efficiently”, and this library, he says, is one way AWS can help organisations get ahead.
Much as we like to think that we’re speeding towards an entirely digital work environment, organisations with legacy systems, such as those in the finance, healthcare and public sectors, still rely on information locked in paper documents.
Adrian Lam is Data Scientist and AI Project Lead for Change Healthcare, a provider of revenue and payment cycle management and clinical information exchange solutions. At re:Invent, he described how the US healthcare system is expensive, complicated and largely manual. It generates huge amounts of waste, largely due to “humans doing repetitive tasks, making sub-optimal decisions.”
For Lam, one way to reduce this waste is to find areas in the process where AI and automation can be applied. “AI isn’t meant to be perfect,” he argued. “It can’t solve all problems, but it can solve a fair number of them. We can have machines take care of the things that are very straightforward, that we know are not complex. Then we can route the things which are more complex to humans.”
So, Lam’s team looked for an opportunity to use AI to help improve Change Healthcare’s processes. They identified auditing complex medical documents, such as iBills, as a huge manual effort for their staff that could easily be rerouted to an AI process. “Medical records are messy,” Lam said. “They come in all sorts of formats, and if you can structure it, that makes applying machine learning to it much easier.”
They found a way to structure these documents in Amazon Textract. It detects and extracts text, table and form data from documents using AI and Optical Character Recognition (OCR), to read documents as a person would. Now, a scanned image PDF iBill is processed through Textract for OCR and table extraction. A little data reformatting takes place, and the information is sent to Change’s operations team for human validation, removing the large-scale manual task of interpreting and processing the documents.
The process is streamlined, quick and simple, and can scale to the volume that Change Healthcare needs. It supports multiple output data formats and offers options for rule-based manual intervention.
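To make the table extraction step a little more concrete, here is a minimal Python sketch of the "data reformatting" idea. It assumes a Textract-style response, where a document comes back as a flat list of blocks (`TABLE`, `CELL`, `WORD`) linked by `CHILD` relationships, as returned by calls such as `analyze_document` with `FeatureTypes=["TABLES"]`. The talk did not describe Change Healthcare's actual code, so the function names, the sample data and the 90% confidence rule below are all illustrative assumptions.

```python
from typing import Dict, List


def child_ids(block: dict) -> List[str]:
    """Collect the ids of a block's CHILD relationships (if it has any)."""
    ids: List[str] = []
    for rel in block.get("Relationships") or []:
        if rel["Type"] == "CHILD":
            ids.extend(rel["Ids"])
    return ids


def tables_from_blocks(blocks: List[dict]) -> List[List[List[str]]]:
    """Rebuild every TABLE block as a list of rows, each a list of cell text."""
    by_id: Dict[str, dict] = {b["Id"]: b for b in blocks}
    tables = []
    for block in blocks:
        if block["BlockType"] != "TABLE":
            continue
        cells = [by_id[i] for i in child_ids(block)
                 if by_id[i]["BlockType"] == "CELL"]
        n_rows = max((c["RowIndex"] for c in cells), default=0)
        n_cols = max((c["ColumnIndex"] for c in cells), default=0)
        grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
        for cell in cells:
            words = [by_id[i]["Text"] for i in child_ids(cell)
                     if by_id[i]["BlockType"] == "WORD"]
            # RowIndex and ColumnIndex are 1-based in the response.
            grid[cell["RowIndex"] - 1][cell["ColumnIndex"] - 1] = " ".join(words)
        tables.append(grid)
    return tables


def needs_human_review(blocks: List[dict], threshold: float = 90.0) -> bool:
    """Hypothetical rule: flag the document if any cell is below the threshold."""
    return any(b["BlockType"] == "CELL" and b.get("Confidence", 0.0) < threshold
               for b in blocks)


def _cell(cell_id: str, row: int, col: int, conf: float, words: List[str]) -> dict:
    """Helper that builds a made-up CELL block for the sample below."""
    return {"Id": cell_id, "BlockType": "CELL", "RowIndex": row,
            "ColumnIndex": col, "Confidence": conf,
            "Relationships": [{"Type": "CHILD", "Ids": words}]}


# Made-up sample shaped like a Textract response (not real billing data).
sample_blocks = [
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
    _cell("c1", 1, 1, 99.1, ["w1"]),
    _cell("c2", 1, 2, 98.7, ["w2"]),
    _cell("c3", 2, 1, 97.0, ["w3", "w4"]),
    _cell("c4", 2, 2, 71.5, ["w5"]),
    {"Id": "w1", "BlockType": "WORD", "Text": "Item"},
    {"Id": "w2", "BlockType": "WORD", "Text": "Charge"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Chest"},
    {"Id": "w4", "BlockType": "WORD", "Text": "X-ray"},
    {"Id": "w5", "BlockType": "WORD", "Text": "120.00"},
]

if __name__ == "__main__":
    print(tables_from_blocks(sample_blocks))
    print(needs_human_review(sample_blocks))
```

In this sketch, a low-confidence cell is enough to send the whole document to a person, which mirrors Lam's point about routing the harder cases to humans. A real pipeline would likely use more nuanced rules.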
How Warner Bros. has extended analytics beyond their data warehouse
Warner Bros. Interactive Entertainment is a powerful force in interactive entertainment – across console, handheld, mobile and PC-based gaming. The company’s gaming platforms generate a high volume of streaming data – so putting it to work to derive analytics and insights was a no-brainer. But, for analytics to take place, the data needed serious transformation and augmentation. Further complicating the project, said Kurt Larson, Technical Director Analytics for Warner Bros, the sheer volume of data made it uneconomical to process and store in their data warehouse, while other solutions couldn’t deliver the scale and performance they needed.
So, using Amazon Redshift Spectrum, Warner Bros extended their Redshift data warehouse out into the data lake. To do this, the high-volume raw data is landed in the S3 Operational Lake. They use the Redshift Spectrum service to query data in the Operational Lake from Redshift, then transform and augment it with data from the Redshift data warehouse and place the result into their S3 Analytical Lake in Parquet format.
“We’ve harnessed Amazon Redshift’s ability to query open data formats across our data lake with Redshift Spectrum since 2017, and now with the new Redshift Data Lake Export feature, we can conveniently write data back to our data lake,” Larson says. “This all happens with consistently fast performance, even at our highest query loads. We look forward to leveraging the synergy of an integrated big data stack to drive more data sharing across Amazon Redshift clusters, and derive more value at a lower cost for all our games.”
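As a rough illustration of that flow, the Python sketch below builds the kind of Redshift `UNLOAD ... FORMAT AS PARQUET` statement that the Data Lake Export feature is based on. It assumes an external (Spectrum) schema already points at the raw data in S3. Every schema, table, bucket and IAM role name here is hypothetical, and the query is not Warner Bros' own.

```python
from typing import Sequence


def build_unload_sql(select_sql: str, s3_prefix: str, iam_role_arn: str,
                     partition_by: Sequence[str] = ()) -> str:
    """Return an UNLOAD statement that writes a query result to S3 as Parquet."""
    # UNLOAD takes the query as a quoted string, so any single quotes
    # inside the query are doubled.
    quoted_query = select_sql.replace("'", "''")
    sql = (
        f"UNLOAD ('{quoted_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET"
    )
    if partition_by:
        sql += "\nPARTITION BY (" + ", ".join(partition_by) + ")"
    return sql + ";"


# Hypothetical query: raw events read through a Spectrum external schema
# ("operational_lake"), joined to a dimension table stored in Redshift.
example_query = (
    "SELECT e.event_date, e.player_id, p.region, e.event_type "
    "FROM operational_lake.raw_events AS e "
    "JOIN dim_players AS p ON p.player_id = e.player_id "
    "WHERE e.event_type = 'match_end'"
)

if __name__ == "__main__":
    print(build_unload_sql(
        example_query,
        "s3://example-analytical-lake/match_events/",          # hypothetical bucket
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole",  # hypothetical role
        partition_by=["event_date"],
    ))
```

The generated statement would be run against the cluster like any other SQL. Partitioning the output by date is just one plausible layout for an analytical lake.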
Machine learning is going to transform cancer research
Raphael Gottardo, Scientific Director, Translational Data Science Integrated Research Centre, spoke about how the Fred Hutch Cancer Research Centre is using machine learning to calculate the likelihood of a patient reacting positively to immunotherapy treatment.
T cells and the immune system are the best weapon we have against cancer, but all bodies are different. T cells are lymphocytes that can target and kill viruses and cancer cells. Only one in every 100,000 T cells responds to immunotherapy, so for researchers, identifying these unique cells is the key to curing cancer. Previously, this was a manual process – which was time-consuming and introduced bias into the dataset, meaning that information that might be crucial to helping immunotherapy could be left on the table.
The solution? Gottardo’s team developed a machine learning method, FAUST, to perform robust, unbiased cell discovery and annotation. However, the method was computationally taxing, and for their research to be meaningful, they needed to study thousands of samples – millions of cells. So, they used AWS Batch to scale the method to clinical-sized data sets, vastly improving the speed and accuracy of cell data analysis – accelerating targeted immunotherapy.
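One common way to fan a per-sample analysis out across AWS Batch is an array job, where each child job picks its own sample using the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable. The talk did not say how Fred Hutch structured its jobs, so the Python sketch below is only an assumption about what such a setup could look like. The job, queue and sample names are hypothetical.

```python
import os
from typing import List, Mapping


def array_job_request(job_name: str, job_queue: str, job_definition: str,
                      n_samples: int) -> dict:
    """Build keyword arguments for boto3's batch.submit_job, one child per sample."""
    if not 2 <= n_samples <= 10_000:
        raise ValueError("AWS Batch array jobs take a size between 2 and 10,000")
    return {
        "jobName": job_name,
        "jobQueue": job_queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": n_samples},
    }


def sample_for_this_child(sample_keys: List[str],
                          env: Mapping[str, str] = os.environ) -> str:
    """Inside a child job, pick the sample that matches this child's array index."""
    index = int(env["AWS_BATCH_JOB_ARRAY_INDEX"])
    return sample_keys[index]


if __name__ == "__main__":
    # Hypothetical sample list; a real run would read it from a manifest.
    samples = [f"samples/patient_{i:04d}.fcs" for i in range(3)]
    request = array_job_request("cell-annotation-demo", "example-queue",
                                "example-job-definition", len(samples))
    print(request)
    # Submitting would look like: boto3.client("batch").submit_job(**request)
    print(sample_for_this_child(samples, {"AWS_BATCH_JOB_ARRAY_INDEX": "1"}))
```

Because each child job handles exactly one sample, the number of samples analysed in parallel is limited mainly by the compute environment behind the queue rather than by the analysis code.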