Web integration platform eases way to machine learning models

StoryFit data scientists employ machine learning algorithms to gauge the prospects of film scripts. They use tools to make data preparation easier.

Even Hollywood is looking to AI these days.

The data scientists building the machine learning models that help studios evaluate scripts faced a big question: whether to focus on creating the algorithms or to also spend time preparing the data that feeds the models, according to Monica Landers, CEO of StoryFit, a startup based in Austin, Texas.

StoryFit data scientists use analytics to help gauge whether storylines for books, films and TV shows are likely to garner an audience that justifies the effort.

"This is a different way to look at content. We apply data to storytelling," Landers said.

What Landers described is a mix that relies on text, demographic, social media sentiment and other data -- much of it residing on the web -- that StoryFit couples with natural language and AI processing to create machine learning models.

While Netflix has been using analytics to fine-tune content programming for some time, it is generally still early going for such efforts. As the cost of entertainment programming has continued to rise, so has interest in data analytics for script selection and doctoring and for promoting productions after launch.

How StoryFit predicts narratives' success

Studios use analytics based on the machine learning models to predict how scripts and manuscripts will fare as they are turned into films, books and so on, as well as to cite areas where writers can make improvements.

Landers said StoryFit has built machine learning models that understand story elements. StoryFit ingests and maps whole texts of books and scripts, and finds patterns and similar offerings. Users can then correlate that information with web recommendations, ratings, reviews and other general signs of online buzz.

Do your data science

For StoryFit, the model is the most important element, and it's the first focus of the tech team.

"We are resource-bound as a startup. But we realized early on that we were going to need a lot of data," she said, adding that she did not want data preparation to consume undue amounts of StoryFit data scientists' time.

Landers said her team has turned to web data integration tools from an outside vendor to port the data to the machine learning models, so team members can concentrate on the data science part of the process.

The Web Data Integration software enables data scientists to describe the type of data they want and to retrieve it in a suitable database format, she said.

"The data scientist can focus on just the data, not the data collecting," Landers said. "We get logs, tracking and data structure, and we can get immediately to work."

Up from site scraping

Tooling for retrieving web data has been around almost from the beginning of the web, but advances continue. The vendor behind the Web Data Integration platform was founded in 2012 with the original goal of extracting data from public websites, according to Gary Read, CEO of the company, which is based in Los Gatos, Calif. It entered a world in which a variety of so-called site scraping utilities and tools were available, and its tooling related to that task has expanded over time. Along the way, the company has added a number of enhancements for data preparation and management to the platform.

"Companies are starting to rely on data they get from the web. It can be mission-critical. But, sometimes, the data quality becomes very poor. There is a huge multitude of site data variety, and the sites are always changing," he said.

On Jan. 29, the vendor further enhanced the platform with a data quality metrics dashboard, speedier extraction processes and other automated capabilities. Such traits position the tools to be used more in AI and machine learning applications, which thrive on diverse web data, but which can be stymied when fed flawed data sets.

Unique challenges

Back on the model-making side, StoryFit's Landers said the startup faces other challenges beyond choosing whether to build machine learning models or prepare data. Finding the balance between the familiar and the unique in entertainment analysis is one of them.

It's important to leave room for art in the screenplay development process, she said, when asked if the AI algorithms might generate uninspiring sequels to former successes, as Hollywood's human teams often do.

"We have to be thoughtful when we are applying machine learning and not replicate or reinforce elements that are repeating either past mistakes or past successes," she said.

StoryFit analysis has found it's important both to uncover story elements that make the audience feel comfortable and to create story elements that are unique, Landers said. Familiarity seems to have its place in the screen arts, she said, but making it new continues to be the path to successful adaptations.
