
Security Datalake is taking on an increasingly central role in SentinelOne’s offerings. XDR and the new Purple AI (a generative AI addition) make the most of the data in the datalake.

The days of traditional EPP (endpoint protection), with virus definitions and signatures, are now pretty far behind us. EPP is still there, of course, and is still important. However, first EDR (endpoint detection and response) and now XDR (extended detection and response) have succeeded and to some extent superseded it. This development indicates that cybersecurity is increasingly being looked at from an overarching perspective: from protecting endpoints (EPP), via detection and threat hunting on endpoints (EDR), to detection and response for endpoints and everything else in an environment (XDR).

SentinelOne has played an important role in the development of EDR and XDR since its inception. However, the company’s ambitions do not stop at developing an XDR platform, we hear from Sjoerd de Jong, Solution Engineer at the company. SentinelOne continuously adds features to its platform. Generative AI is a big recent one. Read this article to learn all about that, and the rest of the SentinelOne platform moving into 2024.

SentinelOne Security Datalake

With an XDR solution from a vendor such as SentinelOne, an organization undoubtedly has one of the most advanced security tools on the market today. Thanks to AI/ML, it is possible to detect threats relatively quickly, and with the platform’s remediation and rollback technology, organizations can also recover quickly if something goes wrong. We wrote an extensive overview piece on SentinelOne’s solution several years ago. Be sure to read that article as well for more background information.

However, the world does not stop at SentinelOne. “We know we won’t end up in a greenfield environment,” in De Jong’s words. Hence, SentinelOne had to look beyond its own product. Every organization has a lot of data that can be important from a security point of view. However, not all of that data is analyzed by security tooling. This mainly concerns data from other security solutions that organizations already have running. That kind of data needs to be aggregated one level higher. That is what SentinelOne wants to achieve with Security Datalake.

The idea of a security datalake is exactly what you might expect based on the name. This is where all the data important for security purposes should come together. Without such a datalake, a platform approach to security is hopeless, De Jong points out. In his words: “The first task is to make all available data useful and correlate it.” You do that by ingesting this data into a platform such as Security Datalake. Mind you, this does not mean that the rest of SentinelOne’s offerings are no longer important, De Jong immediately adds. “XDR and our datalake are inextricably linked, these combine a place where data ends up with action taken based on the analysis of that data.”

Aggregation alone is not enough

On paper, a security datalake sounds like a good idea. SentinelOne isn’t the only one doing it either; we even see datalake companies like Snowflake doing similar things. At the end of the day, however, all that data has to bring results. Anyone can collect data, but we also need to be able to do something useful with it. Many organizations have stumbled over exactly that before, for example by getting into a SIEM platform a little too enthusiastically. That should not happen with a security datalake.

To make optimal use of the data that ends up in Security Datalake, SentinelOne of course has the aforementioned integration with its own XDR offering. Before that integration can be used optimally, however, there are still some steps to go through within Security Datalake, De Jong points out. To that end, SentinelOne made a targeted acquisition in early 2021: Scalyr. This acquisition enables the import, high-speed processing and normalization of large amounts of data.

De Jong specifically points to normalization at this point. SentinelOne has parsers running for most security solutions. These normalize ingested data to OCSF (Open Cybersecurity Schema Framework), an open schema standard in the security industry. This normalization is important because “not every security vendor calls a computer a computer, sometimes they call it host or endpoint,” De Jong gives as an example. Normalizing the data allows it to be offered as a single data set to SentinelOne’s XDR platform. It also makes it easy to get specific perspectives on your environment. For example, you can look at events from the endpoint’s perspective, which is important for forensic analysis: you sometimes see slightly different things from one point of view than from another.
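To make the normalization idea concrete, here is a minimal sketch of what such a parser does in principle. It is not SentinelOne’s implementation: the vendor field names, the `normalize` and `events_for_device` helpers and the target keys (`device.hostname`, `user.name`) are assumptions chosen for illustration, loosely modeled on OCSF-style naming rather than taken from the actual schema.

```python
# Hypothetical aliases: different vendors use different names for the same thing.
# Real parsers and the real OCSF schema are far richer than this mapping.
FIELD_ALIASES = {
    "computer": "device.hostname",
    "host": "device.hostname",
    "endpoint": "device.hostname",
    "user": "user.name",
    "username": "user.name",
    "account": "user.name",
}


def normalize(event: dict, source: str) -> dict:
    """Rename vendor-specific keys to one shared schema and record the origin."""
    normalized = {"source": source}
    for key, value in event.items():
        # Unknown fields are kept under an 'unmapped.' prefix instead of dropped.
        target = FIELD_ALIASES.get(key.lower(), f"unmapped.{key}")
        normalized[target] = value
    return normalized


def events_for_device(events: list, hostname: str) -> list:
    """Return every normalized event that concerns one endpoint."""
    return [e for e in events if e.get("device.hostname") == hostname]


# Two made-up vendors describing the same machine with different field names.
events = [
    normalize({"Computer": "LAPTOP-042", "User": "alice"}, source="vendor_a"),
    normalize({"host": "LAPTOP-042", "account": "alice"}, source="vendor_b"),
]

endpoint_view = events_for_device(events, "LAPTOP-042")
```

After normalization both events carry the same `device.hostname` key, so a single lookup yields the endpoint’s perspective across sources, which is the kind of view the forensic example above relies on.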

Ultimately, a datalake is only useful if it also contains the right data. This means that it must be possible to ingest all relevant data. Aren’t there any limitations to that? According to De Jong, in principle there aren’t. You can do the integrations through the Marketplace, a kind of app store. Cloud-to-cloud integrations work in a fairly standard way via APIs. For everything that is not cloud, SentinelOne has alternative methods. This goes as far as a collector that retrieves the data itself. “Even data from the oldest environments can be ingested,” he promises. “We’ve already seen most of the use cases,” he adds, which is why he dares to make that statement.

From symptom to cause

Ultimately, the availability of SentinelOne Security Datalake, combined with the Marketplace and XDR, should make analysts’ work considerably easier. The idea is that they can now immediately see all relevant context. “From EDR or EPP, you often know that a notification is a symptom of something bigger, or that it stands alone,” according to De Jong. That in itself is not the problem. Then what is? “The question analysts want answered is whether the fire has been put out,” he points out. That question cannot be answered unequivocally with the aforementioned tooling. “It could be that it’s a one-time incident on an endpoint, but XDR data can also show whether it’s a compromised identity, for example,” De Jong explains.

If you set up Security Datalake optimally and link it to XDR, you can answer the above questions, we gather from De Jong’s words. As an example, he mentions an integration with Proofpoint, one of the major players in the field of email security. “Through this integration you know exactly what has happened down to the individual email,” according to him. The promise is that the integration only gives you relevant information. In this example, SentinelOne’s platform does the correlation and makes the connections. Proofpoint basically just provides the raw data, which goes through a parser and flows into the datalake.
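As a rough illustration of what “making the connections” can mean, the sketch below links an endpoint alert to emails the same user received shortly before it. This is only an assumed, simplified form of correlation: the `correlate` function, the event fields and the one-hour window are hypothetical and do not describe how SentinelOne or Proofpoint actually work.

```python
from datetime import datetime, timedelta


def correlate(email_events, endpoint_alerts, window=timedelta(hours=1)):
    """Pair each endpoint alert with emails the same user received shortly before it."""
    matches = []
    for alert in endpoint_alerts:
        for mail in email_events:
            same_user = mail["user"] == alert["user"]
            delay = alert["time"] - mail["time"]
            # Only keep emails that arrived before the alert, within the window.
            if same_user and timedelta(0) <= delay <= window:
                matches.append((alert["id"], mail["id"]))
    return matches


# Made-up, already normalized events from an email source and an endpoint source.
emails = [
    {"id": "mail-1", "user": "alice", "time": datetime(2024, 1, 8, 9, 0)},
    {"id": "mail-2", "user": "bob", "time": datetime(2024, 1, 8, 9, 5)},
]
alerts = [
    {"id": "alert-7", "user": "alice", "time": datetime(2024, 1, 8, 9, 20)},
]

linked = correlate(emails, alerts)
```

A join like this only works because both sources use the same field names after parsing, which is why normalization has to come before correlation.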

The above sounds good, but is very reactive. Does it work the other way as well? “Bidirectionality is definitely the goal eventually,” De Jong indicates. Of course, that makes it more complex. However, there are already response integrations available. As an example, he mentions the one with Palo Alto Networks. Thanks to that integration, SentinelOne can send URLs and files back to Palo’s products and platforms. That information can then be used by that vendor’s firewalls to improve their protection. This makes for more effective and efficient protection. “You protect the chain with this,” De Jong summarizes. That’s very important, because no security vendor can do it alone, even though some may want you to believe otherwise.

Purple AI, a generative security assistant

At this point in the conversation, we note that while this all sounds good, it also doesn’t seem to make things easier. It sounds like there is only more information and notifications coming to SOCs and security teams. Those teams are already swamped with alerts. SentinelOne realizes that too. Hence, the company has a – how could it be otherwise in 2023/2024 – generative AI solution. This should make the lives of front-line specialists significantly better, we hear from De Jong.

Purple AI – that’s the name of the new solution/service – is part of Security Datalake. Any Security Datalake customer can access Purple AI. SentinelOne developed the model behind it itself. This was actually not a huge job at all, De Jong points out. “The principle behind it had been in our agent for a long time,” according to him. He does immediately add that the ML and AI that SentinelOne has had in the agent since 2013 is not the same as generative AI. But there are obviously concepts that both technologies share. That makes it easier for a company like SentinelOne to take this step than for a player that has no experience in this area.

Purple AI should make security easier

Ultimately, the idea is that Purple AI removes a lot of complexity. It has to, because “for most people, queries are still too complex and difficult,” De Jong points out. That makes it very difficult to send the right query to Security Datalake (and thus to XDR).

With Purple AI, you can basically ask any security question. Purple AI interprets the question, in whatever language you ask it. It answers with relevant data, interprets that data and summarizes the highlights in plain language, and suggests actions as well as deeper follow-up questions.

You can ask to show all devices vulnerable to a specific attack, but also ask much more general questions. You can use it as a kind of security assistant, so to speak, by asking how the organization’s security posture is. Thanks to its integration with the datalake, in which all data is normalized, Purple AI can also immediately indicate why something is not right and give recommendations. If it is still not completely clear, you can ask if Purple AI can specify it a little better.

How reliable is Purple AI?

AI can make many claims. However, organizations need to trust these claims. Especially when it comes to organizational security, this is crucial. According to De Jong, SentinelOne has built in measures for that. You can basically monitor and query everything from Purple AI, including for auditing purposes. For example, you can view the raw data and the Datalake query generated by Purple AI. There are clear buttons for that in the interface.

This type of explainability is very important to build and maintain trust in AI in these environments. SentinelOne has offered automated playbooks for its own environments since its inception. Thanks to integrations with other vendors, it can now do so beyond its own environment. The functionality will only increase toward the future as more and more (bidirectional) integrations become available. We’re convinced that will happen across the board. It is the direction we need to take to adequately secure increasingly complex environments.

Conclusion: SentinelOne takes big steps toward platform approach

SentinelOne is no longer solely an XDR player. XDR remains a core part of what it does, of course, but the company has now taken the necessary steps toward the future. As far as we are concerned, that future is undoubtedly one of security platforms. By linking XDR to Security Datalake, SentinelOne has become one of them. Such a platform also offers many opportunities for further expansion. If the data in such a datalake is correct, it can be used for all kinds of purposes. Especially with the advent of Purple AI, it gets easier to glean insights from this data. However, trust is a crucial part of this equation. Once that is in place, the added value of security platforms can become even greater, especially if they also optimally integrate with each other.
