
Released

Poster

DataJoint: Managing Big Scientific Data Using Matlab or Python

MPG Authors
Ecker,  A
Research Group Computational Vision and Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Department Physiology of Cognitive Processes, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Sinz,  F
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Research Group Computational Vision and Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Berens,  P
Research Group Computational Vision and Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Tolias,  AS
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Department Physiology of Cognitive Processes, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Reimer, J., Yatsenko, D., Ecker, A., Walker, E., Sinz, F., Berens, P., et al. (2016). DataJoint: Managing Big Scientific Data Using Matlab or Python. Poster presented at AREADNE 2016: Research in Encoding And Decoding of Neural Ensembles, Santorini, Greece.


Citation link: https://hdl.handle.net/21.11116/0000-0000-7B76-2
Abstract
The rise of big data in modern research poses serious challenges for data management. Large and intricate datasets from diverse instrumentation must be precisely aligned, annotated, and organized in a flexible way that allows swift exploration and analysis. Data management should guarantee the consistency of intermediate results in multi-step processing pipelines, so that changes in one part automatically propagate to all downstream results. Finally, data organization should be self-documenting, so that lab members and collaborators can access the data with minimal effort, even years after it was collected. High levels of data integrity are expected, yet research teams have diverse backgrounds, are geographically dispersed, and rarely have a primary interest in data science. Although the challenges associated with large, complex datasets may be new to biologists, they have been addressed quite successfully in other contexts by relational databases, which provide a principled approach to effective data management. DataJoint is an open-source framework that provides a clean implementation of the core concepts of the relational data model to facilitate multi-user access, efficient queries, distributed computing, and cascading dependencies across multiple data modalities. Critically, although DataJoint relies on an established relational database management system (MySQL) as its backend, data access and manipulation are performed transparently in the commonly used languages MATLAB or Python, and DataJoint can be integrated into new and existing analyses written in these languages with minimal effort or additional training. DataJoint is not limited to particular file formats, acquisition systems, or data modalities and can be quickly adapted to new experimental designs. DataJoint and related resources are available at http://datajoint.github.com.
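
The idea of cascading dependencies mentioned in the abstract can be illustrated with a minimal sketch. It does not use DataJoint or MySQL: it assumes plain SQLite from the Python standard library, and the table names (session, recording, spike_count) and all values are hypothetical, chosen only for illustration. It shows how foreign keys in a relational database keep downstream results consistent with the upstream data they depend on.

```python
import sqlite3

# Illustration only: plain SQLite, not DataJoint's API or its MySQL backend.
# All table names and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# A three-step pipeline: session -> recording -> spike_count.
# Each downstream table references its upstream table with ON DELETE CASCADE.
conn.executescript("""
CREATE TABLE session (
    session_id INTEGER PRIMARY KEY,
    subject    TEXT NOT NULL
);
CREATE TABLE recording (
    recording_id INTEGER PRIMARY KEY,
    session_id   INTEGER NOT NULL
                 REFERENCES session(session_id) ON DELETE CASCADE,
    n_samples    INTEGER NOT NULL
);
CREATE TABLE spike_count (
    recording_id INTEGER PRIMARY KEY
                 REFERENCES recording(recording_id) ON DELETE CASCADE,
    n_spikes     INTEGER NOT NULL
);
""")

conn.executemany("INSERT INTO session VALUES (?, ?)",
                 [(1, "mouse_a"), (2, "mouse_b")])
conn.executemany("INSERT INTO recording VALUES (?, ?, ?)",
                 [(10, 1, 5000), (11, 1, 6000), (20, 2, 7000)])
conn.executemany("INSERT INTO spike_count VALUES (?, ?)",
                 [(10, 42), (11, 17), (20, 8)])


def count_rows(table):
    """Return the number of rows in one of the fixed tables above."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


tables = ("session", "recording", "spike_count")
before = {t: count_rows(t) for t in tables}

# Deleting an upstream entry removes everything computed from it.
conn.execute("DELETE FROM session WHERE session_id = 1")
after = {t: count_rows(t) for t in tables}

# A downstream result without a matching upstream entry is rejected.
try:
    conn.execute("INSERT INTO spike_count VALUES (99, 5)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(before, after, orphan_rejected)
```

Deleting one session here removes its recordings and their derived spike counts in a single step, so no stale downstream result can outlive the data it was computed from. How DataJoint itself declares and enforces such dependencies is not described in the abstract and is not shown here.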