Session Proposals

Open source design workflows in OpenRefine's development ecosystem

Who can contribute to OpenRefine's design framework? What workflows would enable not only broader participation, but also more actionable and achievable design contributions? Design practice in open source products and communities is notoriously difficult to implement in equitable and effective ways, so in this session we can discuss together how this could work in the context of OpenRefine: What can we learn from design projects carried out with OpenRefine in the past, and what could a future design contribution framework for OpenRefine look like?

Making OpenRefine more useful as an exploratory tool

It'd be great to explore options to improve basic data visualization in OpenRefine. Using visualization tools as part of facets is great IMO, but the visual side could be improved, especially if we're better integrating GIS tools into OpenRefine. We could discuss the scope we're comfortable with when it comes to exploratory analyses and suggest types of visualizations/analyses we feel are lacking right now.

Bridging OpenRefine and GIS

It would be great to have improved tools to read/write common GIS formats like shapefiles. Perhaps having a closer collaboration with OSM? ArcGIS is becoming fairly open to open-source as well and they basically dominate the academic world and multiple industries. Some governments are turning to QGIS, which might be the fastest way forward for now, but it'd be interesting to hear from anyone who has specific challenges with GIS and OpenRefine.

Adding geospatial data to non-geospatial datasets can also represent a bit of a challenge for beginners when there are so many good geospatial sources out there.

Improving OpenRefine contributor pathways: Roles, Permissions, and Processes

Our ongoing discussions throughout 2023 and early 2024 (https://forum.openrefine.org/t/improving-the-onboarding-process-for-new-contributors/882) have highlighted the need for a more systematic approach to managing our contributors. This session will serve as a platform to share our ideas and discuss how we can formalize the process. We will look into the following points:

1. Defining Contributor Roles: What are the various contributor roles we should recognize and encourage within our community? How can we ensure that each role is clear and meaningful?

2. Setting Permission Levels: What should the permission levels be for different stages of involvement (e.g., new contributor vs. long-term committer)? How can these levels help in managing community contributions more effectively?

3. Managing Permissions: What processes should we implement for granting and revoking permissions? What criteria should be used to ensure fairness and transparency in these decisions?

What does an Extension Developer role look like, given that it generally sits outside of main OpenRefine development? Extension developers may want or need closer ties to the other contributors. We also need an overhaul of the extension tutorial.
Keven L. Ates, 01.05.2024

If only OpenRefine could be more like…

In this session, all participants are invited to bring up projects that OpenRefine should take inspiration from. This could relate to all sorts of aspects:

- user interface ("tool X feels much less clunky than OpenRefine")

- features ("I always miss this feature from tool Y when I work with OpenRefine")

- project structure ("OpenRefine should be made by a single person / only volunteers / only paid staff / a company like project X")

- documentation ("it's much easier to learn / teach tool Y because…")

- contribution workflows ("I prefer contributing to project X because…")

- any other aspect you can think of!

In a first brainstorming phase, we will gather wishes and group them by area. Then, for each wish, we will invite the participant who expressed it to briefly explain (and possibly demonstrate onscreen, if doable) their wish.

The outcomes of this session will be documented in the minutes, which will be posted on the forum.

For documentation, the extension documentation needs a long-overdue overhaul. What will this look like for a new version of OpenRefine?
Keven L. Ates, 01.05.2024

Supporting data upload to more platforms

OpenRefine currently supports data upload to Wikibase, but also, for instance, to RDF databases via the RDF Transform extension or to SNAC, also via an extension. More such extensions (or forks) have been developed in the past, with varied maintenance status. A recent discussion with the GND community explored the idea of building a similar integration to populate the authority file of the German national library.

How can we enable more such integrations? Are there specific integrations we want to prioritize? Should those be developed inside the OpenRefine project or outside, as extensions?

The goal of this session is to explore those questions together: mapping the ecosystem of existing and potential integrations, reflecting on what their common needs are and how OpenRefine can meet them better.

I've explored an idea for executing SPARQL queries and updates directly from OpenRefine to pull and push data. How can we make SPARQL queries easier in OpenRefine via expressions? Can we use a SPARQL query as a source, i.e., import data from a triple store using SPARQL? Can we export the data directly to a triple store via a SPARQL update?
Keven L. Ates, 01.05.2024
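
As a starting point for the "SPARQL query as a source" question, here is a minimal sketch of the import direction. It is written in Python for brevity and is not OpenRefine's importer code. It assumes a standard SPARQL 1.1 endpoint that accepts the query in a `query` URL parameter and answers in the SPARQL JSON results format. The endpoint URL, the function names, and the sample response are all hypothetical.

```python
import json
from urllib.parse import urlencode


def build_query_url(endpoint, query):
    """Build a SPARQL Protocol GET URL; the query goes in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON results document into column names plus rows of strings."""
    doc = json.loads(results_json)
    columns = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable can be unbound in a given solution; emit an empty cell for it.
        rows.append([binding.get(col, {}).get("value", "") for col in columns])
    return columns, rows


# Hypothetical endpoint and query, for illustration only (no request is sent).
url = build_query_url(
    "https://example.org/sparql",
    "SELECT ?item ?label WHERE { ?item ?p ?label }",
)

# Hypothetical response in the SPARQL JSON results format.
SAMPLE = """
{
  "head": {"vars": ["item", "label"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://example.org/a"},
     "label": {"type": "literal", "value": "Alpha"}},
    {"item": {"type": "uri", "value": "http://example.org/b"}}
  ]}
}
"""

columns, rows = bindings_to_rows(SAMPLE)
```

Each query variable becomes a column and each solution becomes a row, which is roughly the shape a project import would need. The export direction would go through a SPARQL update request instead and is not covered by this sketch.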

Strategizing our Roadmap for user needs

It would be nice to have someone go over the general roadmap of OpenRefine at the start. A final session on the last day might be where we summarize the consensus and put the Roadmap plan into a GitHub Project (perhaps with milestones). From my personal viewpoint, it seems we just need to get consensus on priorities, which seem to keep shifting from our users' perspective. We do already have some forum posts on Roadmap discussions, but for many users, they feel disconnected from "what is possible" versus "what do users need" versus the actually realized "what can we get funded to actually work on despite our users' needs".

(a sub-component of this discussion is how best to present dependencies of Roadmap items to users. Many other projects on GitHub use milestone partitioning instead of the "Tracks" and "Tracked by" fields. For example: "4.0-first", "4.0-second", "4.0-extensions-first", etc., in order to know that the "first" issues need to be worked on first because the "second" issues depend on them.)

(another sub-component of this discussion is adding new fields to the GitHub Project tables to enhance Roadmap visibility)

I am very interested in what the 4.0 extension process / connector will look like. This needs solid documentation and examples. A comprehensive catalog of the internal hooks is needed to help guide extension developers on their availability and use. It feels like there are hidden hooks that require a deep dive into the main OpenRefine code to find them. Library dependency issues occur when extension libs conflict with main OpenRefine libs. Can an extension override an OpenRefine lib when certain "newer" functionality is needed by the extension? Should this be sandboxed in some way? Also, Java version guidance is / was an issue. Apparently, some end users are confused about the need to install updated Java versions to run OpenRefine.
Keven L. Ates, 01.05.2024