Democratizing big data: How Brainspace can create citizen data scientists in less than a day

This content is provided by Cyxtera

As federal departments and agencies rush to catch up with the data train, figuring out how to collect and store as much data as they can, they need to be thinking about what comes next: how to use it. The amount of data agencies and organizations collect is growing exponentially. The complexity is growing too. Data is pulled from websites, news articles, social media, text messaging, emails and mobile apps in addition to that generated internally by organizations. Every platform is an opportunity to collect more data. As a result, federal departments and agencies are drowning in a sea of data yet thirsting for knowledge.

Discerning knowledge from the vast amounts of data the government holds is complicated by the fact that data scientists are relatively few, highly valued in both the public and private sectors, and the subject of fierce competition. Without data scientists on staff, departments and agencies are often left in the untenable position of having vast amounts of data yet too few ways to take advantage of it.

“We’re trying to meet that challenge,” said Michael Griffin, director of marketing at Brainspace, a Cyxtera company focused on artificial intelligence. “We’re trying to solve the problem of both data growth and data complexity, because we recognize that data is an organization’s biggest asset. I think for the government, it’s quickly becoming its most valuable resource for our citizens, if you think about how you would use that data proactively, or even reacting to an event. We have built an artificial intelligence solution that can accelerate the process of identifying insights in your data. We want you to make more informed decisions faster.”

The problem for most organizations is sifting through massive amounts of data to figure out what’s relevant. Most organizations use keyword indexes that let users review information by starting with what’s known. But this approach only takes an analyst so deep into the data; it’s easy to overlook something relevant if you don’t know it’s there.

Brainspace digs through a million documents an hour. It reads each one and pulls out key terms, phrases and data such as names, places, Social Security numbers and birthdays. Brainspace then plots them in visualizations that show the relationships between these data points.

“One of the things that Brainspace does exceptionally well is uncover the things that they didn’t know,” said Steve Rapp, director of sales at Brainspace. “We focus a lot more on concept search, and this capability to really uncover ideas, concepts and issues further upstream in the process that allow the analyst and the end-user to make better decisions.”

For example, Rapp said a federal task force used Brainspace on a 12-million-document research project. Brainspace reduced the task force’s data by 40 percent and uncovered data that the task force’s proprietary platform had overlooked.

Another client Rapp worked with used Brainspace to scrape tweets about the company. In doing so, it discovered a cluster of negative sentiment originating in Yemen that it hadn’t noticed before, stemming from its work in Saudi Arabia.

A Justice Department civilian agency uses Brainspace to sort through the petabyte of information it takes in every year, searching for the roughly 3 percent of data the agency actually finds most relevant.

Quickly discovering knowledge in vast amounts of unstructured data, without having to invest in costly and lengthy training or build expensive ontologies and architectures, helps differentiate Brainspace from other data discovery tools.

“We can accelerate the process of finding the signal in the noise,” Griffin said. “We know there’s a lot of noise in the data. Really what you’re looking for is something very small. You’re looking for who knew what and when, and most of the time that’s less than 1% of the collected information that you’re analyzing. So what we’ve done is develop a tool that lets you get to that signal as quickly as possible.”

The other area that makes Brainspace stand out is its focus on user experience. Algorithms and machine learning are very complex systems, and most artificial intelligence companies put all of their effort into those core analytics capabilities. Brainspace went another route, focusing on connecting the machine with the human operating it by layering a user interface over the top. The whole system was designed with the layman analyst or investigator in mind.

“That person doesn’t have to be a data scientist,” Rapp said. “It could be a soldier in a field who grabs a hard drive, pulls that into Brainspace, and starts understanding what’s presented to him.”

Users don’t have to interact with the data directly. Instead, they work with what the system presents to them: easy-to-understand data visualizations backed by machine learning. That democratizes the data across an organization, lowering barriers to access while increasing potential use cases.

“It’s such a tremendous challenge to harness insights from that data as it continues to grow. It’s a moving target,” Griffin said. “Today’s data collected by the government continues to grow. It’s getting more diverse and how do you make sense of it? How do you turn it into actual intelligence? How do you use it to your advantage? How do you make it an asset in your organization? That’s exactly what Brainspace does. Regardless of the size of data, we’re going to accelerate the process of helping you find insights from that data so that you can execute your mission faster, more accurately, and at lower cost. Brainspace can help make any user a citizen data scientist in less than a day.”

