Use metadata field on data point as partitioning key for download #848

Closed
mtwestra opened this issue Oct 9, 2014 · 20 comments

Comments

@mtwestra
Contributor

mtwestra commented Oct 9, 2014

At the moment, when an enumerator downloads data, he/she downloads all the data on the project. However, in many cases, project managers prefer to give enumerators access only to a subset of the data.

As an example, consider the case of Ghana. Surveys are defined per region. But the monitoring will be done on district level, so the project manager prefers to give access to change data only on the district level. But at the moment only the data for the whole region can be downloaded.

One way to solve this would be to put an additional piece of metadata on the data point, just like the displayName that we have now. This metadata could be filled in by a question in the survey, for example 'district'. Access to the data could then be partitioned based on this metadata field.
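
A minimal sketch of this partitioning idea in Java, assuming a data point carries a free-form metadata map next to its display name. The names `MetadataPartitionSketch`, `DataPoint`, `metadata`, and `filterByMetadata` are hypothetical and only illustrate the proposal; they are not existing code in this project.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class MetadataPartitionSketch {
    // Hypothetical data point: id, display name, and extra metadata
    // filled in from survey answers, e.g. "district" -> "District-1".
    static final class DataPoint {
        final String id;
        final String displayName;
        final Map<String, String> metadata;

        DataPoint(String id, String displayName, Map<String, String> metadata) {
            this.id = id;
            this.displayName = displayName;
            this.metadata = metadata;
        }
    }

    // Keep only the data points whose metadata field `key` equals `value`.
    // Data points without that field are excluded.
    static List<DataPoint> filterByMetadata(List<DataPoint> all, String key, String value) {
        List<DataPoint> result = new ArrayList<>();
        for (DataPoint p : all) {
            if (value.equals(p.metadata.get(key))) {
                result.add(p);
            }
        }
        return result;
    }
}
```

With something like this, a download request for "District-1" would call `filterByMetadata(all, "district", "District-1")` and return only that subset.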

@iperdomo
Contributor

iperdomo commented Oct 9, 2014

I guess that the project structure solves this issue, right? Each region could be a folder and you grant access to a district.

@ichinaski
Contributor

Just doing some brainstorming...

Another possible approach would be to sync datapoints on demand. That is, you need to provide a datapoint ID in order to sync it (admins can just read it from the dashboard). Once a datapoint is downloaded, it will be updated in each sync cycle, until it gets unlinked. This will give field managers much tighter control over synced data.
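
A rough sketch of the on-demand idea in Java, assuming the device keeps a set of linked datapoint IDs and each sync cycle only touches those. The class `OnDemandSyncSketch` and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class OnDemandSyncSketch {
    // IDs this device has been told to follow.
    private final Set<String> linkedIds = new LinkedHashSet<>();

    void link(String dataPointId) {
        linkedIds.add(dataPointId);
    }

    void unlink(String dataPointId) {
        linkedIds.remove(dataPointId);
    }

    // One sync cycle: of all IDs available on the server,
    // return only the ones that are currently linked.
    List<String> idsToSync(List<String> serverIds) {
        List<String> result = new ArrayList<>();
        for (String id : serverIds) {
            if (linkedIds.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }
}
```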

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

@iperdomo @ichinaski,

The project structure won't solve it, as the region has only a single project with a single form. Doing this on a district level would mean you need to copy the survey 120 times.

Doing it by datapoint ID is too fine-grained: a typical district here will have some 500 data points.

@ichinaski
Contributor

Alternative idea: Display the whole list of available datapoints, and manually select which ones you want to download. This is the same behavior Dropbox and Google Drive have on a mobile app.

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

@ichinaski this does not make sense to me. First of all, the idea is to restrict access - not allowing enumerators to download certain data points, based on some rule. Second, the numbers are too large: a district has 500 data points, out of 3000 in a region.

@muloem
Member

muloem commented Oct 9, 2014

This is an access control issue but on the device level if I understand correctly. Depending on when we want such a functionality available I see two options:

  1. in the case where we have device users / login, then we would control access there, i.e., when a user is trying to download data there is a check what permissions/access they have and only that data can be downloaded.
  2. in the present case, maybe we can add something in the survey assignment that determines what the device is allowed to download. For example, an assignment can allow downloading, and in addition, have a filter defining what can be downloaded based on fields in the surveyed locale.
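
Option 2 could be sketched roughly as follows in Java, assuming an assignment stores the devices it targets, a download flag, and a single field/value filter on the surveyed locale. `AssignmentFilterSketch`, `Assignment`, and `mayDownload` are hypothetical names, not the current assignment model.

```java
import java.util.Map;
import java.util.Set;

final class AssignmentFilterSketch {
    // Hypothetical assignment: which devices it targets, whether download
    // is allowed, and one field/value filter on the surveyed locale.
    static final class Assignment {
        final Set<String> deviceIds;
        final boolean allowDownload;
        final String filterField;
        final String filterValue;

        Assignment(Set<String> deviceIds, boolean allowDownload,
                   String filterField, String filterValue) {
            this.deviceIds = deviceIds;
            this.allowDownload = allowDownload;
            this.filterField = filterField;
            this.filterValue = filterValue;
        }
    }

    // A device may download a locale only if the assignment allows download,
    // targets that device, and the locale matches the filter.
    static boolean mayDownload(Assignment a, String deviceId, Map<String, String> localeFields) {
        return a.allowDownload
                && a.deviceIds.contains(deviceId)
                && a.filterValue.equals(localeFields.get(a.filterField));
    }
}
```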

@ichinaski
Contributor

> First of all, the idea is to restrict access - not allowing enumerators to download certain data points, based on some rule.

In my previous comments I was assuming a field manager was in charge of data sync.
If you filter datapoints based on a particular response, how do you prevent enumerators from downloading extra data?

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

@ichinaski Good point. Probably, we need real users on the devices, so we can assign subsets of the data to users.
@muloem yes, exactly - it could be done on device level now. But perhaps we should wait until we have users on the device.

@muloem
Member

muloem commented Oct 9, 2014

Agreed. If we are not losing anything major by waiting a little it makes sense to wait for device users.

@iperdomo
Contributor

iperdomo commented Oct 9, 2014

My concern with the metadata approach is that it is ad hoc. Right now we have flags at Question level that map to a property in the data point (SurveyedLocale). This relationship is not declared anywhere; it lives only in hard-coded logic. We're now proposing to add a third property on Question that will map to another property in SurveyedLocale. If we follow this path, this design will become our "golden hammer"... We need to find a better approach to solve this.

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

Another option could be to use the cascading questions - to use a level of such a question to define subsets of the data. But that would be done more or less in the same way as above. Any other options?
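
As a sketch of the cascade variant in Java: assuming, purely for illustration, that a cascading answer is stored as one string with levels separated by `|` (for example `Region-1|District-1|Village-1`), a chosen level could serve as the partition key. The storage format and the names `CascadeLevelSketch`, `levelValue`, and `inPartition` are assumptions.

```java
final class CascadeLevelSketch {
    // Returns the value at `level` (0-based), or null if the answer
    // has fewer levels than requested.
    static String levelValue(String cascadeAnswer, int level) {
        String[] parts = cascadeAnswer.split("\\|");
        return (level >= 0 && level < parts.length) ? parts[level] : null;
    }

    // True if the answer's value at `level` equals `expected`.
    static boolean inPartition(String cascadeAnswer, int level, String expected) {
        return expected.equals(levelValue(cascadeAnswer, level));
    }
}
```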

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

So the question we are trying to ask is: can we restrict access to a subset of data points of a single project, based on some property. I would think that property needs to come from the answers of the form, right?

@muloem
Member

muloem commented Oct 9, 2014

More specifically the answers of the Registration form

@iperdomo
Contributor

iperdomo commented Oct 9, 2014

Getting back to the project structure + user roles and permissions....

We have designed the security around roles/permissions + projects. So a project is atomic, either you have access to it or you don't.

If you have access to a project and you have access to the SYNC action, you should be able to download all the data for that project.
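
In code terms, the rule is roughly the following Java sketch, assuming permissions are held as a map from project ID to the set of allowed actions. `ProjectPermissionSketch`, `canSync`, and the map layout are hypothetical and are not the actual security model.

```java
import java.util.Map;
import java.util.Set;

final class ProjectPermissionSketch {
    // A project is atomic: the user either has the SYNC action on it
    // (and may download all of its data) or has no sync access at all.
    static boolean canSync(Map<String, Set<String>> actionsByProject, String projectId) {
        return actionsByProject.getOrDefault(projectId, Set.of()).contains("SYNC");
    }
}
```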

We're making a workaround because we don't want to make X copies of a form (where X is the number of districts). This is because a Form belongs to one project. What if a form could be inherited, that is, be visible to all child projects? We could define the form at root level and still follow the project + roles/permissions pattern for data access management. (<- just a brain dump)
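
A small Java sketch of the inheritance idea, assuming projects form a tree and a form defined on a node is visible on all of its descendants. `InheritedFormSketch`, `Project`, and `visibleForms` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class InheritedFormSketch {
    // Hypothetical project node in a folder/project hierarchy.
    static final class Project {
        final String name;
        final Project parent; // null for the root
        final List<String> ownForms = new ArrayList<>();

        Project(String name, Project parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    // Forms visible in a project: its own forms first, then the forms
    // defined on each ancestor up to the root.
    static List<String> visibleForms(Project project) {
        List<String> forms = new ArrayList<>();
        for (Project cur = project; cur != null; cur = cur.parent) {
            forms.addAll(cur.ownForms);
        }
        return forms;
    }
}
```

The form is defined once at the root, while access is still granted per child project.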

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

@iperdomo take the case of Ghana: 10 regions, 120 districts. The ideal case would be a single survey for the whole country, because it is already very inconvenient to change anything across all the per-region survey copies. The reason it was done per region is to limit risks and to make it possible to use dependency-based cascading questions. Having to make 120 copies of a project to achieve the same result would seem a bit much.

@iperdomo
Contributor

iperdomo commented Oct 9, 2014

> Having to make 120 copies of a project to achieve the same result would seem a bit much.

The idea is to not make a copy. But define the Form at root level and this definition can be inherited with some configuration.

@mtwestra
Contributor Author

mtwestra commented Oct 9, 2014

But then you still would need to create 120 items of something, right?

@iperdomo
Contributor

iperdomo commented Oct 9, 2014

Yeap, your district hierarchy definition.

@joycarpediem
Member

What if we use assignments based on the data point display name?
For example, say the current display name is "SubRegion-District-Village-HouseholdName".
In the "Data Assignment" tab, the project manager:

  1. Selects the survey.
  2. Selects which display name question to filter on, say "District" in this case.
  3. Selects the value for "District" that should be synced to particular devices. Say "District-1" should be synced to devices D1, D2, D3: the manager selects that "District" option value and those devices.
  4. While syncing, our logic can be:
    if (displayName.contains("District-1")) {
      sync
    }
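
One caveat with a plain substring check: `contains("District-1")` is also true for a display name containing "District-10". A hedged Java sketch that compares the selected display name part exactly, assuming the individual answers that make up the display name are available as a list (the names `DisplayNameAssignmentSketch`, `shouldSync`, and `devicesByValue` are hypothetical):

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DisplayNameAssignmentSketch {
    // devicesByValue: selected option value -> devices allowed to sync it,
    // e.g. "District-1" -> {D1, D2, D3}.
    // displayNameParts: the answers that make up the display name, in order.
    // partIndex: position of the question chosen as the filter.
    static boolean shouldSync(Map<String, Set<String>> devicesByValue,
                              List<String> displayNameParts,
                              int partIndex,
                              String deviceId) {
        if (partIndex < 0 || partIndex >= displayNameParts.size()) {
            return false;
        }
        String value = displayNameParts.get(partIndex);
        // Exact match on the whole value, so "District-10" does not
        // match an assignment made for "District-1".
        return devicesByValue.getOrDefault(value, Set.of()).contains(deviceId);
    }
}
```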

@janagombitova
Contributor

duplicate of #2796

6 participants