Historical Responses to Edited Form

July 06 2015

categories:

architecture

Form submissions. You know what those are. What if the form changes?

Here’s my dilemma. What if a user changes or removes a field? What should their
experience be when looking at old responses?

TL;DR There are a lot of scenarios and use-cases. You should have the flexibility to show the response with the form schema that was used to generate the response. If you store your data so that data is never orphaned or misrepresented later, you have flexibility and there’s less confusion when examining your data.

Three scenarios

1. User no longer wants a field

Best case scenario, I feel like I can hide or show old data. At this point who cares?

Conclusion

My thought is that, if the user doesn’t care, we shouldn’t show the old data. Scenario 3 addresses the case where the user does care.

2. User edits a field

Let’s say a user changes the “value” of some checkbox, say from “blue” to “cerulean” (a more specific color). If I’m viewing old responses, should I update “blue” to “cerulean” as the value, or just as the label?

Initial thoughts

I feel like if a user is editing a field, they are fixing the field. Therefore, I am free to label all previous responses with the new updated value. In my years of experience, seeing similar values (e.g. “first name” and “firstname”) that remain unreconciled is an annoyance.
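To make that concrete, here is a minimal sketch of why an edit relabels every old response when answers store option IDs rather than text. The dictionary shapes, the IDs ("f1", "opt1") and the `display_answer` helper are all made up for illustration; they are not my actual models.

```python
# Hypothetical schema: a checkbox field whose options have stable IDs.
field = {
    "id": "f1",
    "label": "Favorite color",
    "options": [
        {"id": "opt1", "label": "blue"},
        {"id": "opt2", "label": "red"},
    ],
}

# Hypothetical responses: each answer stores the option ID, not the label text.
responses = [
    {"id": "r1", "answers": {"f1": "opt1"}},
    {"id": "r2", "answers": {"f1": "opt2"}},
]


def display_answer(field, response):
    """Look up the label for the stored option ID in the given field schema."""
    labels = {opt["id"]: opt["label"] for opt in field["options"]}
    answer_id = response["answers"].get(field["id"])
    return labels.get(answer_id)


before = [display_answer(field, r) for r in responses]

# The user edits the option: same ID, new label.
field["options"][0]["label"] = "cerulean"

after = [display_answer(field, r) for r in responses]
```

Nothing in the responses changed, yet every old "blue" answer now displays as "cerulean". That is the behavior I'm arguing for here, and also the behavior the concerns below are about.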

Concerns

The concern here is that the user treats the editing of a no-longer-desired field as a way to “efficiently” delete and create a new field. If labels are updated to reflect the change, then that could lead to confusing problems.

Conclusion

I feel like users can learn that editing a field relabels old responses, and that this is worth it for the sake of consistent data. They would discover the behavior quickly.

3. User removes a field, and adds a new field meant to replace the old field

This is the case where a user could edit but chooses not to. Or this could arise if the user wants to change a field’s type. Say they originally created a standard “text” field but then want to change it to a “number” field, or even a “textarea”.

Concerns

This case is logically different from the case where a user simply removes an unwanted field, but it is technically the same. Let’s pretend that the user removes a desired field from a mature form (lots of responses).

What can be done?

We can convert fields

At this point, there’s no 100% reliable way to passively reconcile historical data. We could never be certain, even if the names were the same. What does conversion look like? In my case, this would mean going through the responses and updating the old field ID to the new field ID. This, of course, would be in response to the user acting upon some exposed option to convert the field.
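As a rough sketch of what that conversion pass could look like, assuming each response keeps its answers in a dictionary keyed by field ID (the `convert_field` name and the data shapes are hypothetical, not my real code):

```python
def convert_field(responses, old_field_id, new_field_id):
    """Re-key answers from old_field_id to new_field_id.

    Returns the number of responses that were changed. Responses that
    already have an answer under the new field ID are left untouched.
    """
    converted = 0
    for response in responses:
        answers = response["answers"]
        if old_field_id not in answers:
            continue
        if new_field_id in answers:
            # A newer response already answered the new field; don't clobber it.
            continue
        answers[new_field_id] = answers.pop(old_field_id)
        converted += 1
    return converted


# Made-up responses: one old, one new, one that somehow has both.
responses = [
    {"answers": {"old": "42", "name": "Ann"}},
    {"answers": {"new": 7}},
    {"answers": {"old": "x", "new": "y"}},
]
count = convert_field(responses, "old", "new")
```

Note that the converted value is still the string "42". The sketch only moves the answer to the new ID; it does nothing about the field type.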

This seems ok. But…

What if we want to re-render the old field (perhaps to edit)? We’d need to change other properties in the old response (most notably the field type).
What if the old value doesn’t match the approved new values? For example, what if we switch from a “text” field to a “select” field? Furthermore, I’m storing the answers to fields with options as IDs instead of their text representation. This generally makes it much harder to reconcile historic responses, since we’d first have to find the ID of the option whose text matches. And if there is no match? Then what? If we leave the answer as-is, we now have some answers containing IDs and other answers containing strings.
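Here is a small sketch of that text-to-option matching, to show where it breaks down. It assumes the answer is already stored under the new field’s ID and that a case-insensitive label match is good enough; `match_text_to_options` and the data shapes are invented for the example.

```python
def match_text_to_options(responses, field_id, options):
    """Replace free-text answers with option IDs where the text matches a label.

    Returns the IDs of responses whose text matched no option; those
    answers are left as plain strings.
    """
    by_label = {opt["label"].strip().lower(): opt["id"] for opt in options}
    unmatched = []
    for response in responses:
        text = response["answers"].get(field_id)
        if text is None:
            continue  # this response never answered the field
        option_id = by_label.get(text.strip().lower())
        if option_id is None:
            unmatched.append(response["id"])
        else:
            response["answers"][field_id] = option_id
    return unmatched


# Made-up options for the new "select" field and some old text answers.
options = [{"id": "opt1", "label": "Blue"}, {"id": "opt2", "label": "Red"}]
responses = [
    {"id": "r1", "answers": {"color": "blue "}},
    {"id": "r2", "answers": {"color": "teal"}},
    {"id": "r3", "answers": {}},
]
unmatched = match_text_to_options(responses, "color", options)
```

After this runs, r1 holds the ID "opt1" while r2 still holds the string "teal", which is exactly the mixed state described above. The function is also not safe to run twice, because on a second pass the stored IDs would be treated as text.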

This is complicated stuff. Doable, but complicated. If we take reconciliation off the table, what can we do?

Do nothing

While the user did delete the field, I feel like they would be pretty upset to have access to old data blocked.

Show old fields in form responses

Currently, I’m showing the current form labels and then showing the response (if it exists). Old data is hidden. I could continue to do that, but also add the old label and value below. In my case, this means refactoring to store the old label with responses. Right now I just store IDs in responses.
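A sketch of what that display could look like, assuming each answer is stored with a snapshot of the label it was given under. The `render_response` helper, the "(removed)" suffix and the data shapes are hypothetical.

```python
def render_response(current_fields, response):
    """Return (label, value) rows: current fields first, then removed fields."""
    rows = []
    current_ids = set()
    for field in current_fields:
        current_ids.add(field["id"])
        entry = response["answers"].get(field["id"])
        # Use the current label; the value is None if this response has no answer.
        rows.append((field["label"], entry["value"] if entry else None))
    for field_id, entry in response["answers"].items():
        if field_id not in current_ids:
            # The field is gone from the form, so fall back to the stored label.
            rows.append((entry["label"] + " (removed)", entry["value"]))
    return rows


# Made-up form: "Nickname" (f2) was removed and "Age" (f3) was added later.
current_fields = [{"id": "f1", "label": "Name"}, {"id": "f3", "label": "Age"}]
response = {
    "answers": {
        "f1": {"label": "Name", "value": "Ann"},
        "f2": {"label": "Nickname", "value": "Annie"},
    }
}
rows = render_response(current_fields, response)
```

The cost is visible in the response itself: every answer now carries a copy of its label.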

My responses and my forms are tightly coupled. It seems like a waste to essentially store the form schema with every response (which is why I didn’t). I suppose I could store each revision of the form and then mark the revision in the response. I would be storing more forms, but it’s still fewer bytes in most cases (e.g. 5 revisions of a form with 100 responses, compared with 1 form with 100 responses each containing a lot of details about that 1 form). Both of these options would undermine my conclusion about scenario 2 though :( I’m ok with that… I think.

Conclusions

See the next scenario :)

The user deletes a field. Later, they want to see data about that field.

This is similar to the previous case. The difference is that “conversion” wouldn’t help you here. This leaves the option of showing old fields in the form response. This helps reinforce that showing old fields is really the first step in a robust form response archive.

Now what?

Well, now I know that being able to show old form responses is the minimum viable
solution to this problem. This, however, means I need to refactor :|

Refactor follow-up

I solved this problem by treating each form model as a revision. The response data points to the revision. All revisions of the form share a common formId property, which is a new property. I also have a new boolean property called latest on the most recent revision. I could do timestamp stuff, but I like having that property there.
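Roughly, the shape is something like the sketch below. Only formId and latest are the real property names; the in-memory `forms` list, the `revisionId` pointer on the response, and the helper functions are stand-ins for illustration.

```python
import itertools

_ids = itertools.count(1)
forms = []  # every element is one revision of some form


def save_revision(form_id, fields):
    """Store a new revision and mark it as the latest for its formId."""
    for rev in forms:
        if rev["formId"] == form_id:
            rev["latest"] = False
    revision = {"id": next(_ids), "formId": form_id, "fields": fields, "latest": True}
    forms.append(revision)
    return revision


def latest_revision(form_id):
    """The revision to render when someone fills out the form today."""
    return next(r for r in forms if r["formId"] == form_id and r["latest"])


def revision_for(response):
    """The revision that was in effect when the response was submitted."""
    return next(r for r in forms if r["id"] == response["revisionId"])


# A response is submitted against revision 1, then the form is edited.
v1 = save_revision("contact", [{"id": "f1", "label": "Likes pie?"}])
response = {"revisionId": v1["id"], "answers": {"f1": "yes"}}
v2 = save_revision("contact", [{"id": "f2", "label": "Favorite pie"}])
```

New submissions use `latest_revision("contact")`, while the old response still resolves to the schema it was filled out with, so nothing in it is orphaned.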

What principles can I take away from this?

Don’t link two sets of data in a way that lets one set end up with orphaned pieces of data. This leads to confusion, wasted data, and limited options.

Don’t store your data in a way that opens the door wide to an illegitimate representation of the original data’s intent (e.g. a checkbox value changes from “yes” to “no” because of a form field schema edit… we should be able to reconstruct the original “yes” value).