OpenSim/GeospatialEngine

Geospatial data comes in many flavors, the biggest distinction being between raster and vector data. If you're not familiar with it, check out the Wikipedia entry on the subject. Fortunately, there are excellent open source libraries for opening and abstractly accessing the contents of maps - GDAL for raster data and OGR for vector data.
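
For the raster side, getting at a map's contents looks roughly like this - a minimal sketch using GDAL's C API, where load_map is a hypothetical helper and error handling is mostly omitted for brevity:

#include "gdal.h"
#include "cpl_conv.h"

/* load band 1 of a raster into a freshly allocated buffer of doubles;
   assumes GDALAllRegister() has been called once at startup, and the
   caller owns the returned buffer (free it with CPLFree) */
double *load_map(const char *filename, int *cols, int *rows)
{
	GDALDatasetH ds = GDALOpen(filename, GA_ReadOnly);
	if (ds == NULL)
		return NULL;

	GDALRasterBandH band = GDALGetRasterBand(ds, 1);
	*cols = GDALGetRasterXSize(ds);
	*rows = GDALGetRasterYSize(ds);

	double *buf = (double *) CPLMalloc(sizeof(double) * *cols * *rows);
	GDALRasterIO(band, GF_Read, 0, 0, *cols, *rows,
	             buf, *cols, *rows, GDT_Float64, 0, 0);

	GDALClose(ds);
	return buf;
}

Once a map is sitting in a flat buffer like this, the map algebra described below is just arithmetic on arrays.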

The first step towards fully integrating GIS into a simulation engine is to implement map algebra: arithmetic, overlay, and scalar operations. Map algebra is, in general, the ability to do arithmetic operations on map operands, for example:

per_capita_income_map = spatial_income_map / population_map

Overlay operations are similar, but generally refer to the boolean operations. As you can probably guess by now, scalar operations are of the general form:

relative_elevation_map = city_DEM - 702

where one operand is a map and the other is a real or integer value. It is possible, and likely, that you would want to combine several of these operation types in a single equation:

future_city_population = city_limits_mask | (area_population_map + .07 * area_population_map) 


We can accomplish this by looping through all of the values in a raster map, or all of the features in a layer of a vector dataset. This is pretty simple; the last example above can be implemented as:

for (int i = 0; i < map_size; ++i)
{
	/* bitwise OR works here because these maps hold integer cell values */
	future_city_population[i] = city_limits_mask[i] |
		(area_population_map[i] + (int)(.07 * area_population_map[i]));
}

We need all of the raster maps to be the same size, and the vector maps to have the same number of features, which is a common requirement. We also assume that by this point all the maps have the same georeference information, or at least that the user has been warned when the metadata says the maps cover different areas. Of course, the actual implementation is a little more complicated, as we will be working through the GDAL and OGR drivers, but it is not fundamentally different.
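
Here is roughly what that sanity check could look like for two rasters - maps_compatible is a hypothetical helper, and a real version would also compare projections and tolerate floating-point noise in the geotransforms:

#include "gdal.h"

/* returns 1 if two rasters can safely be combined cell-by-cell */
int maps_compatible(GDALDatasetH a, GDALDatasetH b)
{
	double gt_a[6], gt_b[6];
	int i;

	/* same dimensions, so index i means the same cell in both maps */
	if (GDALGetRasterXSize(a) != GDALGetRasterXSize(b) ||
	    GDALGetRasterYSize(a) != GDALGetRasterYSize(b))
		return 0;

	/* same origin and cell size, so the cells cover the same ground */
	GDALGetGeoTransform(a, gt_a);
	GDALGetGeoTransform(b, gt_b);
	for (i = 0; i < 6; ++i)
		if (gt_a[i] != gt_b[i])
			return 0;

	return 1;
}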

What will be really neat is to have the simulation engine discover the operand types from variable definitions, instead of from explicit casts or assumptions in the equations. I suppose I am suggesting a dynamically typed engine. For example:

average_per_capita_income = income / population

Initially, income and population could be defined as constants or as equations of their own, and average_per_capita_income would be computed as a real value. If you then loaded a map of income or population (or both), average_per_capita_income would instead be computed as a map. Type information would only be stored for constant real values and for variables importing maps, but once a model was loaded and its equations parsed, a GUI could query the simulation engine for the current types of variables, so that users would know what they are working with.
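
One way to implement this is a tagged value type, where every variable carries its kind (scalar or map) and arithmetic promotes to a map whenever either operand is one. This is just a sketch under those assumptions - value and val_div are hypothetical names, not part of any existing engine:

#include <stdlib.h>

typedef enum { VAL_SCALAR, VAL_MAP } val_kind;

typedef struct {
	val_kind kind;
	double scalar;   /* valid when kind == VAL_SCALAR */
	double *map;     /* valid when kind == VAL_MAP */
	int map_size;
} value;

/* divide two values; the result is a map if either operand is a map */
value val_div(value a, value b)
{
	value r;

	if (a.kind == VAL_SCALAR && b.kind == VAL_SCALAR) {
		r.kind = VAL_SCALAR;
		r.scalar = a.scalar / b.scalar;
		return r;
	}

	r.kind = VAL_MAP;
	r.map_size = (a.kind == VAL_MAP) ? a.map_size : b.map_size;
	r.map = malloc(sizeof(double) * r.map_size);
	for (int i = 0; i < r.map_size; ++i) {
		double x = (a.kind == VAL_MAP) ? a.map[i] : a.scalar;
		double y = (b.kind == VAL_MAP) ? b.map[i] : b.scalar;
		r.map[i] = x / y;
	}
	return r;
}

With something like this in place, swapping a constant income for an income map changes the kind of the variable, and every equation that uses it promotes automatically - no casts needed in the model itself.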

Once this is done, the more interesting aspect of integrating GIS into a simulation engine is being able to define rules and models for how individual cells (or features) behave. I hope to work on this after the interface, the simulation engine and GIS integration, and the Micropolis integration are either completed or well enough along. There is a commercial software offering that does this, Simile (see the example here). In our Sugar front-end I am confident we can give it a more intuitive look and feel, and we shouldn't have a big problem with the implementation. The biggest issues will be defining operations and safety checks for when cells examine their neighbors' contents, and probably defining some helper operations that return commonly requested information, so those functions don't have to be recreated in each cell model you make - things like testing whether a cell is one of the X cells with the highest value.
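
As a taste of what those safety checks might involve, here is a sketch of a bounds-checked neighbor accessor for a raster stored as a flat array (neighbor and edge_value are hypothetical names; how the engine should treat off-map neighbors is exactly the kind of policy question that needs defining):

/* value of the cell offset (dr, dc) from (row, col), or edge_value
   when that neighbor would fall off the edge of the map */
double neighbor(const double *map, int rows, int cols,
                int row, int col, int dr, int dc, double edge_value)
{
	int r = row + dr;
	int c = col + dc;

	if (r < 0 || r >= rows || c < 0 || c >= cols)
		return edge_value;
	return map[r * cols + c];
}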

Geospatial simulation optimizations

Geospatial simulations are notoriously computationally expensive - where before you were calculating the value of each variable once per timestep, now you are doing each calculation N times per timestep (N being rows*columns for raster data, or the number of features for vector data). This could be a real power drain on the XO. Since these simulations will be running on the simulation engine based on the LLVM JIT, we will be able to perform various compiler optimizations (LLVM supports all the optimizations GCC can do, and more!) on the simulation before we JIT compile it, ensuring that it will run as efficiently as possible on the XO. When the time comes I would love some help with profiling, as I don't (yet!) have any experience with it. I have a preliminary example of some optimizations on my wiki.
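
To make that concrete with the population equation from above, one simple rewrite an optimizer can make is folding the two arithmetic operations per cell into one (out and pop are placeholder names; for floating-point maps this particular simplification needs relaxed floating-point settings to be strictly legal):

/* before: an add and a multiply per cell, N cells per timestep */
for (int i = 0; i < map_size; ++i)
	out[i] = pop[i] + .07 * pop[i];

/* after algebraic simplification: one multiply per cell */
for (int i = 0; i < map_size; ++i)
	out[i] = 1.07 * pop[i];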