Your needs are going to vary a lot based on your application, dataset, and queries, so we are reluctant to make specific claims about what a given project needs. However, we can present some guidelines, plus additional things to consider.

The short answer

The most important thing is having enough RAM.

Choose a plan with at least enough RAM to hold your entire dataset, plus room for the operating system. Allocate at least 1GB for the OS, and estimate the size of your dataset conservatively (that is, estimate high).

In general, a node consumes 15 bytes, a relationship 34 bytes, and a property 64 bytes. A little math will give you a rough estimate of how much space your data occupies today.
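
If you already have data loaded somewhere (a development or staging instance, for example), you can gather the counts for this estimate with plain Cypher. The queries below are a minimal sketch and assume nothing about your model beyond nodes and relationships; they scan the whole graph, so run them against a copy rather than a busy production database.

    // Count nodes and relationships to plug into the sizing formula above.
    MATCH (n) RETURN count(n) AS nodes;
    MATCH ()-[r]->() RETURN count(r) AS relationships;

    // Count properties stored on nodes and on relationships.
    MATCH (n) RETURN sum(size(keys(n))) AS nodeProperties;
    MATCH ()-[r]->() RETURN sum(size(keys(r))) AS relationshipProperties;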

If you expect significant growth, consider provisioning 2-3x the size of your dataset in RAM.
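
To make that concrete, here is a back-of-the-envelope example with made-up counts (your own numbers will differ): 1 million nodes (about 15MB) plus 5 million relationships (about 170MB) plus 20 million properties (about 1.28GB) comes to roughly 1.5GB of data. With 1GB reserved for the OS and 2-3x headroom for growth, that points to a plan with roughly 4-5.5GB of RAM.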

The longer answer

It depends. Some customers run below our recommendations and do fine, particularly if they are less concerned about real-time querying or sustained reads and writes per second. It depends entirely on the kinds of queries you run: more complex queries consume more RAM and CPU, which raises your requirements and can cause major problems on smaller instances (below 4GB RAM).

Also remember to optimize your queries. The most common performance problem we see is queries that don't use indexes and constraints and therefore fall back to unnecessary full graph scans. You might get away with this on a very small dataset, but as it grows, a single non-optimized query can chug away long enough to make Neo4j entirely unresponsive. Use EXPLAIN and PROFILE liberally in your Cypher to make sure you're not doing unnecessary scans.
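
As a sketch of what that looks like in practice (the :User label, email property, and index name here are hypothetical, and the CREATE INDEX syntax shown is for Neo4j 4.x and later):

    // Without an index on :User(email), this lookup does a full label scan.
    // A uniqueness constraint would also back the lookup with an index.
    CREATE INDEX user_email IF NOT EXISTS FOR (u:User) ON (u.email);

    // Prefix the query with EXPLAIN (plan only) or PROFILE (plan plus runtime
    // statistics) and check for NodeIndexSeek instead of NodeByLabelScan.
    PROFILE
    MATCH (u:User {email: 'alice@example.com'})
    RETURN u;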

You can always upgrade. Estimate what you'll need, ask us whether it makes sense, and then go for it. Test and benchmark before you put it into production. If you end up needing more hardware, you can buy it anytime, and we can help you upgrade to a cluster if needed.
