Many scientific fields study data with an underlying structure that is a non-Euclidean space. Some examples include social networks in computational social sciences, sensor networks in communications, functional networks in brain imaging, regulatory networks in genetics, and meshed surfaces in computer graphics. In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques. In particular, we would like to use deep neural networks, which have recently proven to be powerful tools for a broad range of problems from computer vision, natural language processing, and audio analysis. However, these tools have been most successful on data with an underlying Euclidean or grid-like structure, and in cases where the invariances of these structures are built into networks used to model them.
We see that the authors use the term "non-Euclidean data" to refer to data whose underlying structure is non-Euclidean.
Since Euclidean spaces are prototypically defined by R^n (for some dimension n), 'Euclidean data' is data that is sensibly modelled as being plotted in n-dimensional linear space, for example image files (where the x and y coordinates refer to the location of each pixel, and the z coordinate refers to its colour/intensity).
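As a rough illustration of the image example, here is a minimal sketch that turns a tiny grayscale image into points in R^3. The `image` values and the variable names are made up for the example.

```python
# Hypothetical 2x3 grayscale image: rows index y, columns index x,
# and each value is the pixel intensity.
image = [
    [0, 128, 255],
    [64, 192, 32],
]

# Each pixel becomes one point (x, y, z) in R^3, where z is the intensity.
points = [(x, y, z) for y, row in enumerate(image) for x, z in enumerate(row)]

for point in points:
    print(point)
```

Nothing is lost in this mapping: the grid position and the intensity of every pixel can be read back from its coordinates.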
However, some data does not map neatly into R^n, for example a social network modelled by a graph. You can of course embed the physical shape of a graph in 3D space, but you will lose information such as the quality of edges, the values associated with nodes, or the directionality of edges, and there isn't an obvious, sensible way of mapping these attributes to a higher-dimensional Euclidean space. Depending on the specific embedding, you may also introduce spurious correlations (e.g. two unconnected nodes appearing closer to each other in the embedding than to nodes they are connected to).
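To make the "spurious correlation" point concrete, here is a small sketch with a toy directed graph and one hand-picked 2D embedding. All node names, attributes, and positions are invented for illustration; a different embedding would give different distances.

```python
import math

# Hypothetical toy social network: a directed graph with node and edge attributes.
nodes = {"ann": {"age": 31}, "bob": {"age": 45}, "cat": {"age": 27}}
edges = {("ann", "bob"): {"kind": "follows"}, ("bob", "cat"): {"kind": "blocks"}}

# One arbitrary 2D embedding of the nodes (positions chosen by hand).
# Note that it carries no trace of edge direction, edge kind, or node age.
positions = {"ann": (0.0, 0.0), "bob": (5.0, 0.0), "cat": (0.5, 0.5)}

def distance(a, b):
    """Euclidean distance between two nodes in the embedding."""
    return math.dist(positions[a], positions[b])

def connected(a, b):
    """True if there is an edge between a and b in either direction."""
    return (a, b) in edges or (b, a) in edges

# ann and cat share no edge, yet the embedding places them closer to each
# other than either is to bob, the node they are actually connected to.
print(connected("ann", "cat"))
print(distance("ann", "cat"), distance("ann", "bob"), distance("cat", "bob"))
```

A model that only sees `positions` would have no way to tell that ann and cat are unrelated, or that the two edges mean very different things.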
Methods such as graph neural networks seek to adapt existing machine learning techniques to process non-Euclidean structured data directly as input, so that this (possibly useful) information is not lost when transforming the data into the Euclidean input that existing techniques require.
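For a feel of what "directly processing the graph" can look like, here is a deliberately simplified sketch of one neighbourhood-aggregation step, the kind of operation graph neural networks are typically built around. It is an assumption-laden toy: real models use learned weights and non-linearities, and the function name, graph, and feature values below are hypothetical.

```python
def message_passing_step(features, neighbours):
    """Replace each node's feature with the mean of its own and its neighbours' features."""
    updated = {}
    for node, value in features.items():
        incoming = [features[n] for n in neighbours.get(node, [])]
        updated[node] = (value + sum(incoming)) / (1 + len(incoming))
    return updated

# Hypothetical undirected path graph a - b - c, stored as an adjacency list.
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": 1.0, "b": 4.0, "c": 7.0}

updated = message_passing_step(features, neighbours)
print(updated)
```

The update only ever follows actual edges, so the graph never has to be flattened into coordinates first.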
Answer shamelessly stolen from https://ai.stackexchange.com/questions/11226/what-is-non-euclidean-data, but I think it's pretty good :)