Three Ways to Scale your Graph

Estimated reading time: 10 minutes

As businesses grow and their data needs increase, they often face the challenge of scaling their database systems to keep up with the increasing demand.

What happens when your graph grows too large for a single server to store? Or when your instance can no longer cope with the growing number of user requests?

Graph and Entity Resolution Against Cyber Fraud

Estimated reading time: 4 minutes

With the growing prevalence of the internet in our daily lives, the risks of malware, ransomware, and other cyber fraud are rising. The digital nature of these attacks makes it very easy for fraudsters to scale by creating thousands of accounts, so even if one is identified, they can continue their attacks.
In this blog post, we will discuss how graph and entity resolution (ER) can help us battle these risks across different industries such as healthcare, finance, and e-commerce (for example, the US healthcare system alone can save $300 billion a year with entity resolution). You will also receive hands-on experience with entity resolution on ArangoDB.

Combat Fraud with Graph

Estimated reading time: 5 minutes

Fraud is one of the most significant issues facing businesses today. While companies have always faced fraud, detecting fraudulent activity has become even more challenging due to increased online transactions. Globally, fraud results in more than $3.7 trillion in annual losses (Murphy, 2022). Fraud comes in numerous forms, including but not limited to money laundering, identity theft, account takeover, and payment fraud. Due to the variety of ways companies can face fraud, they must have a system to protect themselves and their customers.

Why Should You Care About SOC 2?

And by the way, ArangoDB is SOC 2 compliant!

Estimated reading time: 3 minutes

Drive along California's Highway 101, and its billboards make compliance and SOC 2 seem like an omnipresent, yet challenging, topic. But is it really? And if so, why? In this blog post, we want to share why and how ArangoDB has become SOC 2 compliant.

Introducing the new ArangoDB Datasource for Apache Spark

Estimated reading time: 7 minutes

We are proud to announce the general availability of ArangoDB Datasource for Apache Spark: a new generation Spark connector for ArangoDB.

Nowadays, Apache Spark is one of the most popular analytics frameworks for large-scale data processing. It is designed to process, in parallel, data that is too large or complex for traditional databases, and it achieves high performance by optimizing query execution, caching data in memory, and controlling data distribution.

Introducing ArangoDB 3.9 – Graph Meets Analytics

Estimated reading time: 4 minutes

We are proud to announce the GA release of ArangoDB 3.9!

Congrats to the team and community on the latest ArangoDB release 3.9! ArangoDB 3.9 focuses on extending advanced analytics capabilities and, especially, on scaling graph use cases even further. In the remainder of this blog post, we will dive into some of the features, including Hybrid SmartGraphs, new AQL functions, a new ArangoSearch Analyzer, and various other performance and user experience improvements.

Introducing ArangoDB 3.8 – Graph Analytics at Scale

Estimated reading time: 5 minutes

We are proud to announce the GA release of ArangoDB 3.8!

With this release, we improve many analytics use cases we have been seeing – both from our customers and open-source users – with the addition of new features such as AQL window operations, graph and Geo analytics, as well as new ArangoSearch functionality.

If you want to get your hands on ArangoDB 3.8, you can either download the Community or Enterprise Edition, pull our Docker images, or start a free trial of our managed service ArangoGraph.

As with any release, ArangoDB 3.8 comes with many improvements, bug fixes, and features. Feel free to browse through the complete feature list in the release notes to appreciate all the work which has gone into this release.

In this blog post, we want to focus on some of the highlights including AQL Window Operations, Weighted Graph Traversals, Pipeline Analyzer and Geo Support in ArangoSearch.

AQL Window Operations

The WINDOW keyword can be used for aggregations over related rows, usually preceding and / or following rows.

The WINDOW operation performs a COLLECT AGGREGATE-like operation on a set of query rows. However, whereas a COLLECT operation groups multiple query rows into a single result group, a WINDOW operation produces a result for each query row:

  • The row for which function evaluation occurs is called the current row
  • The query rows related to the current row over which function evaluation occurs comprise the window frame for the current row

There are two syntax variants for WINDOW operations:

  • Row-based (evaluated across adjacent documents)
  • Range-based (evaluated across value or duration range)
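
To make the difference between the two variants concrete, here is a plain-Python sketch of what each one computes. It is not AQL and says nothing about how ArangoDB implements WINDOW; the function names, the way the preceding and following parameters are used, and the sample readings are all assumptions made for illustration.

```python
from statistics import mean

def row_window_avg(values, preceding, following):
    """Row-based frame: average over adjacent rows around each current row."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - preceding)
        hi = min(len(values), i + following + 1)
        out.append(mean(values[lo:hi]))
    return out

def range_window_avg(rows, preceding, following):
    """Range-based frame: average over rows whose key lies within
    [key - preceding, key + following] of the current row's key."""
    out = []
    for key, _ in rows:
        frame = [v for k, v in rows if key - preceding <= k <= key + following]
        out.append(mean(frame))
    return out

# Hypothetical (time, value) readings; note the gap before the last one.
readings = [(0, 10), (1, 20), (2, 30), (10, 40)]
values = [v for _, v in readings]

row_result = row_window_avg(values, preceding=1, following=0)
range_result = range_window_avg(readings, preceding=1, following=0)
```

Both variants produce one result per input row. They only disagree on the last reading: the row-based frame still includes the adjacent row, while the range-based frame finds no other reading within one time unit.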

Weighted Graph Traversals

Graph traversals in ArangoDB 3.8 support a new traversal type, "weighted", which enumerates paths by increasing weights.

The cost of an edge can be read from an attribute which can be specified with the weightAttribute option.

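FOR x, v, p IN 0..10 OUTBOUND "places/York" GRAPH "kShortestPathsGraph"
  OPTIONS {
    order: "weighted",
    weightAttribute: "travelTime",
    uniqueVertices: "path"
  }
  // return each path's vertex keys together with its total weight
  RETURN {
    path: p.vertices[*]._key,
    weight: LAST(p.weights)
  }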

As the previous traversal option bfs was deprecated, the new preferred way to start a breadth-first search from now on is with order: "bfs". The default remains depth-first search if no order is specified, but can also be explicitly requested with order: "dfs".
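
For intuition, enumerating paths by increasing weight can be sketched in a few lines of Python with a priority queue. This is only an illustration of the idea, not ArangoDB's traversal code: the function name, the adjacency-list layout, and the places and travel times other than the York start vertex are assumptions, and edge weights are assumed to be non-negative.

```python
import heapq

def weighted_paths(graph, start, max_depth):
    """Yield (total_weight, path) in order of non-decreasing total weight.

    A vertex may appear only once per path, comparable in spirit to
    uniqueVertices: "path".
    """
    heap = [(0, [start])]
    while heap:
        weight, path = heapq.heappop(heap)
        yield weight, path
        if len(path) - 1 >= max_depth:
            continue  # depth limit reached, do not extend this path
        for neighbor, cost in graph.get(path[-1], []):
            if neighbor not in path:
                heapq.heappush(heap, (weight + cost, path + [neighbor]))

# Hypothetical travel times between places.
graph = {
    "York": [("Leeds", 2), ("London", 5)],
    "Leeds": [("London", 1)],
}
paths = list(weighted_paths(graph, "York", max_depth=10))
```

In this sketch, the two-hop route to London via Leeds (total weight 3) is emitted before the direct edge (weight 5), which is the ordering a depth-first or breadth-first enumeration would not give you.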

ArangoSearch Pipeline & AQL Analyzers

ArangoSearch added a new Analyzer type, "pipeline", for chaining the effects of multiple Analyzers into one. This makes it possible, for example, to combine text normalization for a case-insensitive search with n-gram tokenization, or to split text at multiple delimiting characters and then apply stemming.
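
The chaining idea can be sketched in plain Python as follows. This is not the ArangoSearch API; the function names and the two stand-in steps (lower-casing and character trigrams) are assumptions chosen only to show how the output of one step feeds the next.

```python
def normalize(text):
    """Lower-case the input, standing in for case-insensitive normalization."""
    return [text.lower()]

def make_ngram(n):
    """Return a step that splits a token into character n-grams of length n."""
    def ngram(token):
        return [token[i:i + n] for i in range(len(token) - n + 1)]
    return ngram

def pipeline(*steps):
    """Chain steps: every token emitted by one step is fed into the next."""
    def run(text):
        tokens = [text]
        for step in steps:
            tokens = [out for token in tokens for out in step(token)]
        return tokens
    return run

analyze = pipeline(normalize, make_ngram(3))
tokens = analyze("Graph")
```

Because normalization runs first, "Graph" and "GRAPH" yield the same trigrams in this sketch, which is what makes the combined search case-insensitive.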

Furthermore, the new Analyzer type "aql" is capable of running an AQL query (with some restrictions) to perform data manipulation or filtering. For example, a user can define a soundex Analyzer to search for phonetically similar terms:

arangosh> var analyzers = require("@arangodb/analyzers");
arangosh> var a = analyzers.save("soundex", "aql", { queryString: "RETURN SOUNDEX(@param)" }, ["frequency", "norm", "position"]);

Note that the query must not access the storage engine. This means no FOR loops over collections or Views, no use of the DOCUMENT() function and no graph traversals.

Enhanced Geo support in ArangoSearch

While AQL has supported geo indexing and functions for a long time, ArangoDB 3.8 also adds geo support to ArangoSearch with the GeoJSON and GeoPoint Analyzers and the corresponding ArangoSearch geo functions:

  • GEO_CONTAINS()
  • GEO_DISTANCE()
  • GEO_IN_RANGE()
  • GEO_INTERSECTS()
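
To give a feel for what a distance and an in-range check compute, here is a small Python sketch based on the haversine formula. It is an approximation on a perfect sphere and not ArangoDB's implementation; the function names, the Earth radius constant, and the approximate city coordinates are assumptions. Points are given as [longitude, latitude], the order GeoJSON uses.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)

def distance_m(a, b):
    """Great-circle distance in meters between two [longitude, latitude] points."""
    lon1, lat1, lon2, lat2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))

def in_range(a, b, low, high):
    """True if the distance between a and b lies within [low, high] meters."""
    return low <= distance_m(a, b) <= high

# Approximate coordinates, for illustration only.
cologne = [6.96, 50.94]
berlin = [13.40, 52.52]

d = distance_m(cologne, berlin)
nearby = in_range(cologne, berlin, 0, 100_000)
```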

NB: Check out the community ArangoBnB project to learn more about Geo capabilities in ArangoSearch.

Improved Replication Protocol

For collections created with ArangoDB 3.8, a new internal data format is used that allows for a very fast synchronization of differences between the leader and a follower that is trying to reconnect.

The new format used in 3.8 is based on Merkle trees, which make it more efficient to pinpoint the data differences between the leader and the follower.

The algorithmic complexity of the new protocol is determined by the number of differences between the leader and follower shard data, so if there are no or very few differences, getting in sync is very fast. In previous versions of ArangoDB, the complexity was determined by the number of documents in the shard, because the protocol had to scan all documents in the shard on both the leader and the follower to find the differences.
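
The following Python sketch shows why a hash tree makes the cost depend on the differences rather than on the shard size. It is a generic Merkle-tree comparison, not ArangoDB's actual data format: the bucketing scheme, the hash function, the number of leaves (assumed to be a power of two), and all names are assumptions made for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(docs, leaves=8):
    """Return the tree levels, root first. Each leaf hashes one bucket of docs."""
    buckets = [[] for _ in range(leaves)]
    for key, rev in sorted(docs.items()):
        idx = int.from_bytes(h(key.encode())[:4], "big") % leaves
        buckets[idx].append(f"{key}:{rev}")
    level = [h("|".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.insert(0, level)
    return levels

def differing_buckets(a, b):
    """Walk both trees from the root, descending only where hashes differ.

    Returns the differing leaf indexes and the number of hash comparisons made.
    """
    suspects = [0]
    comparisons = 0
    for depth in range(len(a)):
        differing = []
        for i in suspects:
            comparisons += 1
            if a[depth][i] != b[depth][i]:
                differing.append(i)
        if depth == len(a) - 1:
            return differing, comparisons
        suspects = [child for i in differing for child in (2 * i, 2 * i + 1)]

# Hypothetical shard contents: document key -> revision.
leader = {f"doc{i}": 1 for i in range(100)}
follower = dict(leader)
follower["doc42"] = 0  # the follower holds a stale revision of one document

same, cost_same = differing_buckets(build_tree(leader), build_tree(leader))
diff, cost_diff = differing_buckets(build_tree(leader), build_tree(follower))
```

With identical data, a single comparison of the root hashes is enough. With one stale document, the walk touches only the hashes along the path to the affected bucket, regardless of how many documents the shard holds.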

The new protocol is used automatically for all collections/shards created with ArangoDB 3.8. Collections/shards created with earlier versions will use the old protocol, which is still fully supported. Note that such “old” collections will only benefit from the new protocol if the collections are logically dumped and recreated/restored using arangodump and arangorestore.

Other notable features

Upgrade

Upgrading to ArangoDB 3.8 can be performed with zero downtime following the upgrade instructions for your respective deployment option. Please note our recent update advisory and update either to a newer 3.6/3.7 version or 3.8 if you are running an affected version.

ArangoGraph

The easiest way to give ArangoDB 3.8 a spin is ArangoGraph, ArangoDB’s managed service in the cloud.

Feedback

Feel free to provide any feedback either via our Slack channel or mailing list.

Special Edition Lunch Session

Join Simran Spiller on August 4th for a special Graph and Beyond Lunch Session #15.5 - Aggregating Time-Series Data with AQL.

The new WINDOW operation added to AQL in ArangoDB 3.8 allows you to compute running totals, rolling averages, and other statistical properties of your sensor, log, and other data. You can aggregate adjacent documents (or rows if you will), as well as documents in value or duration ranges with a sliding window.

In this lunch and learn session, we will take a look at the two syntax variants of the WINDOW operation and go over a few example queries with visual explanations.

ArangoDB 3.7 – A Big Step Forward for Multi-Model

Estimated reading time: 7 minutes

When our founders realized that data models can be features, we at ArangoDB set ourselves the big goal of developing the most flexible database. With today’s GA release of ArangoDB 3.7, the project reached an important milestone on this journey.

Azure Now Generally Available on ArangoDB Oasis

Estimated reading time: 6 minutes

ArangoDB ArangoGraph, the cloud service of ArangoDB, has reached an important milestone. We are glad to announce that Azure support is generally available in ArangoDB ArangoGraph as of today!

Currently Azure is supported in the following regions:

  • West US, Washington
  • East US, Virginia
  • Central Canada, Toronto
  • UK, London
  • West Europe, Netherlands

ArangoDB 3.4 GA
Full-text Search, GeoJSON, Streaming & More

The ability to see your data from various perspectives is the idea of a multi-model database. Having the freedom to combine these perspectives into a single query is the idea behind native multi-model in ArangoDB. Extending this freedom is the main thought behind the release of ArangoDB 3.4.

We’re always excited to put a new version of ArangoDB out there, but this time it’s something special. This new release includes two huge features: a C++-based full-text search and ranking engine called ArangoSearch, and greatly extended capabilities for geospatial queries through the integration of the Google™ S2 Geometry Library and GeoJSON.
