Observations from different product types #14

@jgomezdans

Description

@TonioF wrote:

whilst interfacing the Observations class (see here:
https://github.com/multiply-org/multiply-core/tree/master/multiply_core/observations
) and trying to use it in a wrapper for KaFKA (
https://github.com/multiply-org/inference-engine/blob/master/multiply_inference_engine/inference_engine.py
), I have come across an issue: how should we ideally deal with the case
where KaFKA requires input from different product types? Should these be
handled by several observations objects, one per product type? Or
should all products be provided by a single observations object? If so,
should they be ordered temporally? How would we make the distinction
as to which product type an observation comes from (the simple
approach would be to return this info as a string)? And how would KaFKA
deal with receiving different types of observations?

Well, as things stand now, I assume that you will have one "class" per observation stream. Consider the case where you have S1 and S2, for which some observation classes are already available (see here; it's not like we're starting from zero). Ideally, the approach would be very simple:

S1_obs = S1Observations(blah blah blah) # Define the S1 observations
S2_obs = Sentinel2Observations(blah blah blah) # Define the S2 observations
all_observations = S1_obs + S2_obs # Combine into a single object
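One way the `+` combination above could work is via a small wrapper that merges the streams behind the same two-member interface. This is only a sketch under assumptions: the class names `CombinedObservations` and `ObservationsBase` are hypothetical, and only the `dates` attribute and `get_band_data` method are taken from the issue text.

```python
from typing import Any, List


class CombinedObservations:
    """Hypothetical wrapper merging several observation streams.

    Exposes the same interface as the individual classes:
    a `dates` attribute and a `get_band_data` method.
    """

    def __init__(self, streams: List[Any]):
        self.streams = streams
        # Union of all acquisition dates, kept in temporal order.
        self.dates = sorted({d for s in streams for d in s.dates})

    def get_band_data(self, date) -> List[Any]:
        # Return the observations from every stream that has data for the
        # queried date (there may be more than one on the same day).
        return [s.get_band_data(date) for s in self.streams if date in s.dates]


class ObservationsBase:
    """Hypothetical base class so that `S1_obs + S2_obs` builds a combined object."""

    def __add__(self, other):
        return CombinedObservations([self, other])
```

With this, `S1_obs + S2_obs` yields an object whose `dates` is the temporally ordered union of both streams' dates, and whose `get_band_data` hands back a list of per-stream results for the queried date.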

all_observations only needs to provide a get_band_data method and a dates attribute (from memory ;D), which return the observations from the different streams for the queried date. KaFKA will loop over them (in theory, the order doesn't alter the product, but YMMV). So if on a given time step there are both S1 and S2 observations, it will assimilate one and then go on and assimilate the other before advancing the state. This procedure would be identical to having two or more S2 (or S1) observations on the same day.
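The time loop described above can be sketched as follows. This is not KaFKA's actual code: `assimilate` and `advance` are hypothetical stand-ins for its update and predict steps, and the only assumed interface is the `dates`/`get_band_data` pair mentioned in the text.

```python
def run_filter(all_observations, state, assimilate, advance):
    """Assimilate every observation available on each date, then advance.

    `assimilate` and `advance` are placeholders for the filter's
    update and state-propagation steps.
    """
    for date in all_observations.dates:
        # Several streams (e.g. S1 and S2) may report on the same date;
        # each one is assimilated in turn before the state moves on.
        for obs in all_observations.get_band_data(date):
            state = assimilate(state, obs)  # order should not alter the result
        state = advance(state, date)  # propagate the state to the next date
    return state
```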

Each observation class returns a set of things when queried for the data at a given time step (the data, a mask, some relevant metadata, some uncertainty representation, a suitable emulator or emulators, ...). There's an overview of this here.
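The bundle of things listed above could be grouped in a simple container like the one below. The name `BandData` and the exact field set are assumptions for illustration only; the issue deliberately leaves the actual structure open.

```python
from typing import Any, NamedTuple


class BandData(NamedTuple):
    """Hypothetical container for one time step's query result."""

    data: Any          # the observed values (e.g. reflectance, backscatter)
    mask: Any          # which pixels/samples are valid
    metadata: dict     # relevant metadata (acquisition geometry, etc.)
    uncertainty: Any   # some uncertainty representation
    emulators: Any     # suitable forward-model emulator(s) for this sensor
```

A lightweight named tuple (rather than a rigid class hierarchy) matches the "light and simple I/O" stance discussed below: new fields can be added as the problem is better understood.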

So basically, as it stands, the interface is observation-type agnostic, because the particular properties of every sensor are covered by the emulator, the uncertainty, the mask, etc. KaFKA doesn't need to know where an observation comes from (that's useful for debugging, but not for the functionality).

Interestingly, some bits of information are still missing for each observation stream. This is because I'm hesitant to impose a fancy structure on a problem we don't yet know how to solve. The definition of the I/O is left light and simple for a good reason: it makes it easy to add new observation types and to modify things, which will inevitably happen as we understand more about how everything works together.
