[GIS] System requirement specification for a Big Data PostgreSQL, PostGIS, and GeoServer based service

geoserver postgis postgresql

I want to start a GIS portal for around 3,000-5,000 visitors per day. My database will be approximately 10 terabytes in size. The maps and layers are moderately complex, and satellite-map overlay is supported. I am basically trying to project maps onto satellite images so that people using my system can figure out the route they want to take. I will be using GeoServer, PostGIS, and PostgreSQL for this purpose. Kindly let me know what the hardware and/or software requirements might be to start such a project, and the number and types of computing hardware I may need for smooth operation.
Thanks in advance.

Best Answer

We have a comparable system (a research project); we're aiming for a 20TB PostgreSQL database holding coverage data such as remote sensing imagery. We're running this on a PowerEdge R720xd rack server with 128GB of RAM, with CentOS as the operating system. Our other software is MapServer, Rasdaman, the Apache httpd web server, and Apache Tomcat. We have no idea yet how many requests the server will handle; at this stage we're more interested in the type of processing we can do on the coverage data.

Our data sits on a RAID 6 array, which is good enough for our purposes, but for a production system, which yours sounds like, you should almost certainly opt for RAID 10. Because RAID 10 mirrors every disk, you will need a system with at least 20TB of raw disk capacity to hold your 10TB database.
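As a rough check of the disk arithmetic, here is a minimal sketch comparing usable capacity under RAID 6 and RAID 10. The function names and the 4TB disk size are assumptions chosen for illustration, not details of our setup, and a real array will lose a little more to formatting overhead and any hot spares.

```python
import math


def raid10_usable_tb(disks: int, disk_tb: float) -> float:
    """RAID 10 mirrors every disk, so half of the raw capacity is usable."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return disks * disk_tb / 2


def raid6_usable_tb(disks: int, disk_tb: float) -> float:
    """RAID 6 spends two disks' worth of capacity on parity."""
    if disks < 4:
        raise ValueError("RAID 6 needs at least 4 disks")
    return (disks - 2) * disk_tb


def raid10_disks_needed(data_tb: float, disk_tb: float) -> int:
    """Smallest even disk count whose RAID 10 usable capacity covers data_tb."""
    mirrored_pairs = max(2, math.ceil(data_tb / disk_tb))
    return 2 * mirrored_pairs


# Hypothetical example: a 10TB database on 4TB disks.
disks = raid10_disks_needed(10, 4)
print(disks, "disks,", disks * 4, "TB raw,", raid10_usable_tb(disks, 4), "TB usable")
print("Same disks in RAID 6:", raid6_usable_tb(disks, 4), "TB usable")
```

With these assumed numbers RAID 10 needs more raw disk than RAID 6 for the same usable space, which is the price paid for its mirroring.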

The file system on our RAID array is XFS. See the following page for file system options for large data volumes: What are the file and file system size limitations for Red Hat Enterprise Linux?

You should also configure at least 16GB of swap space. See the following blog post for a discussion of how much swap space to configure: Linux: Should You Use Twice the Amount of Ram as Swap Space?

For our system the majority of the workload will be at the database end. We may have to increase the amount of RAM, but at the moment we have just taken the default amount supplied by Dell for this system. For a system without heavy database usage you could probably get away with a lot less.

For example, we have a WMS server (using MapServer) with multiple services, one of which serves 45,000 GetMap requests per week (without any tile caching). This server has 4GB of RAM and dual 2.67 GHz Xeon processors, and so far it has handled this load with no issues.
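To put that figure in perspective, here is a back-of-the-envelope conversion to an average request rate. The 45,000 requests per week is our measured figure; the 20 GetMap requests per visit used for your portal is purely a guess for illustration, so substitute your own estimate. Real traffic is bursty, so peak rates will likely be well above these averages.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_WEEK = 7 * SECONDS_PER_DAY


def avg_rate_per_second(requests: float, period_seconds: int) -> float:
    """Average request rate over a period, ignoring daily peaks."""
    return requests / period_seconds


# Our WMS service: 45,000 GetMap requests per week.
ours = avg_rate_per_second(45_000, SECONDS_PER_WEEK)

# Hypothetical assumption: each visitor triggers about 20 GetMap requests.
ASSUMED_REQUESTS_PER_VISIT = 20
low = avg_rate_per_second(3_000 * ASSUMED_REQUESTS_PER_VISIT, SECONDS_PER_DAY)
high = avg_rate_per_second(5_000 * ASSUMED_REQUESTS_PER_VISIT, SECONDS_PER_DAY)

print(f"Our server:  {ours:.3f} requests/s on average")
print(f"Your portal: {low:.3f} to {high:.3f} requests/s on average (assumed)")
```

Under that assumption your portal would see roughly ten times our average load, so treat our 4GB server as a lower bound rather than a target.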

I understand that you would like GeoServer specifications, and I don't have any to hand, but MapServer and GeoServer performance is very similar, as you can see from the FOSS4G benchmarking tests.
