2019-10-18 Steering Committee Minutes for Data Helix Project

Attendees

  • Alan Froggatt

  • Andrew Carr

  • Colin Eberhardt

  • Matt Richards

  • Antony Welsh

Minutes


Alan Froggatt gave a progress update

  • Work has been undertaken to separate the parsing, validation and generation logic more cleanly

  • The JSON profile has been significantly simplified. It is now terser and more intuitive for the user, and it allows easier and better validation during the JSON reading stage rather than during the generation stage. This also makes the generation code simpler.

  • Work is in progress on a sandbox UI that will be based on AWS Lambda and will be hosted on GitHub

  • Work has started on designing a framework for custom generators. The idea is that users can implement Java ‘generator’ interfaces that we provide, inject their implementations at runtime and reference them in their profiles

  • Investigation of third-party generators (Faker and Mockaroo) is underway. Integration with Python data generation libraries has not started but will be added to the list.

    • The plan is to integrate third-party libraries using the same framework that we build for custom generators

  • We are aiming for a v1 of Data Helix by the end of the year

    • The team will need some help to improve the public-facing ‘image’ of the project

    • ACTION: Colin E to provide more guidance on what makes a good open source project
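
To illustrate the custom generator idea from the progress update, here is a minimal Java sketch of how a user-supplied generator might be registered at runtime and looked up by the name a profile refers to. The framework design had only just started at the time of this meeting, so every name here (ValueGenerator, GeneratorRegistry, CoinFlipGenerator, "coinFlip") is hypothetical and is not the project's actual API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class CustomGeneratorSketch {

    /** Hypothetical interface a user would implement; not the project's actual API. */
    public interface ValueGenerator {
        /** Name by which a profile would refer to this generator. */
        String name();

        /** Produce one value using the supplied source of randomness. */
        Object generate(Random random);
    }

    /** Hypothetical registry that the tool would populate at runtime. */
    public static final class GeneratorRegistry {
        private final Map<String, ValueGenerator> byName = new HashMap<>();

        /** Register a generator, rejecting duplicate names. */
        public void register(ValueGenerator generator) {
            if (byName.putIfAbsent(generator.name(), generator) != null) {
                throw new IllegalArgumentException("Duplicate generator name: " + generator.name());
            }
        }

        /** Resolve a name taken from a profile, rejecting unknown names. */
        public ValueGenerator lookup(String name) {
            ValueGenerator generator = byName.get(name);
            if (generator == null) {
                throw new IllegalArgumentException("Unknown generator: " + name);
            }
            return generator;
        }
    }

    /** Example user-supplied generator: returns "heads" or "tails". */
    public static final class CoinFlipGenerator implements ValueGenerator {
        @Override
        public String name() {
            return "coinFlip";
        }

        @Override
        public Object generate(Random random) {
            return random.nextBoolean() ? "heads" : "tails";
        }
    }

    public static void main(String[] args) {
        GeneratorRegistry registry = new GeneratorRegistry();
        registry.register(new CoinFlipGenerator());

        // A profile would reference the generator by name, here "coinFlip".
        ValueGenerator generator = registry.lookup("coinFlip");
        Random random = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(generator.generate(random));
        }
    }
}
```

Resolving generators by name keeps the profile independent of the implementing class. The same lookup could in principle serve wrappers around third-party libraries, which is the integration route the minutes describe.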

We discussed the DataHub

  • This is the project that Paul from Citigroup is trying to get added to FINOS. It is a collection of Python libraries for data generation

  • We discussed whether we should combine forces in some way and agreed that Andrew and Alan should contact Paul for an open discussion about both our projects

    • We believe we should aim to use DataHub libraries within Data Helix in a later release. The plan is to do a general investigation of integrating Python data generation libraries into Java beforehand

    • Perhaps Data Helix could enhance the DataHub, making it easier to use

 

We discussed the “Updated related field constraints question” that was emailed to the committee earlier in the week.

  • The development team have been debating how to simplify comparison constraints in the JSON schema and the code, and this was an example on which they sought an opinion

  • We settled on the route that is simplest to implement! This means extra grammar but less overloading of what value fields can mean in these constraints (i.e. whether a value is a field name or a string value)



Content on this page is licensed under the CC BY 4.0 license.
Code on this page is licensed under the Apache 2.0 license.