User:Zexi.gong721/Final Report

Project Final Report: Wikidocumentaries to import images from the web to Structured Data on Commons

Project Overview

The objective of this project was to establish an efficient workflow that enables users to retrieve media relevant to a currently viewed topic in Wikidocumentaries from a designated media repository and upload it to Wikimedia Commons while incorporating structured data statements. The project aimed to address various challenges in authentication, media upload, and structured data generation.

Mentors: Susanna Ånäs, Tuukka Hastrup

Project Content

Authentication

The authentication component was a pivotal aspect of the project, as it allowed users to securely interact with Wikimedia Commons. The OAuth 2.0 authentication mechanism has been successfully integrated into the project.

The authentication aspect of the project was addressed by introducing a visual indicator in the toolbar that reflects the user's status.

Clicking on this icon triggers a dropdown menu that provides options for logging in or logging out, along with displaying the username if the user is logged in.

Upon clicking the login option, users are redirected to the Wikimedia Commons authentication page. After successful login, users are automatically redirected back to the previous page in Wikidocumentaries, creating a cohesive and uninterrupted experience.
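The redirect-and-return flow described above can be sketched as follows. This is a minimal illustration of the OAuth 2.0 authorization-code redirect, not the project's actual code: it is written in Python for brevity, the function names and the session dictionary are hypothetical, and the authorize URL is assumed to be the OAuth 2.0 route exposed by the MediaWiki OAuth extension on Commons.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: OAuth 2.0 authorize route of the MediaWiki OAuth extension.
AUTHORIZE_URL = "https://commons.wikimedia.org/w/rest.php/oauth2/authorize"


def build_login_redirect(client_id, redirect_uri, return_to):
    """Return the URL to send the user to, plus data to keep in the session.

    `return_to` is the Wikidocumentaries page the user was viewing, stored so
    the user can be sent back there after a successful login.
    """
    state = secrets.token_urlsafe(16)  # random value to protect against CSRF
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    })
    session = {"state": state, "return_to": return_to}
    return AUTHORIZE_URL + "?" + query, session


def handle_callback(callback_url, session):
    """Validate the callback and return (authorization_code, page_to_return_to)."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != session["state"]:
        raise ValueError("OAuth state mismatch")
    return params["code"][0], session["return_to"]
```

The authorization code returned by `handle_callback` would then be exchanged for an access token on the server side; that step is omitted here.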

Upload

The upload functionality was essential for allowing users to directly contribute media to Wikimedia Commons through Wikidocumentaries. This component aimed to implement the upload process and associate relevant metadata with the uploaded media.

The upload component encountered challenges due to outdated and oversimplified examples in the upload API documentation. The discrepancy between the documentation and the actual process complicated the implementation, and we had to invest significant effort into structuring the API requests correctly. Additionally, the error messages received during the upload process lacked specificity, which made it hard to pinpoint and resolve issues. For example, the permission denial error appeared frequently without any clear indication of which permissions were lacking. Resolving this error required repeated testing or communication with the API developers.

The upload component of the project focused on optimizing the process of contributing media to Wikimedia Commons. A significant enhancement was the introduction of an interactive popup that displays essential image information, including title, description, copyright license, category, creator, and date. This popup is triggered when users open the action menu for an image in the image grid and then select the upload option.

This information is invaluable for users to validate and verify before proceeding with the upload process. To complete the upload, users can click the designated "Upload" button within the popup. Behind the scenes, the image information is parsed into Wiki text format, ensuring compatibility with Wikimedia Commons' requirements. The parsed information is then combined with the uploaded image and submitted to Wikimedia Commons through the API.
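As an illustration of the parsing step, the sketch below turns the fields shown in the popup into a Commons file description page. It is a minimal sketch in Python, not the project's implementation: the function name, the dictionary keys, and the `source` field are hypothetical, and it assumes the license has already been mapped to the name of a Commons license template. The `{{Information}}` template and `[[Category:...]]` links are standard Commons wikitext.

```python
def build_wikitext(info):
    """Build a file description page from image metadata.

    `info` is a hypothetical dictionary with the keys used below. Values are
    inserted as-is; a real implementation would also need to escape
    characters such as "|" that have a meaning inside templates.
    """
    lines = [
        "=={{int:filedesc}}==",
        "{{Information",
        "|description=" + info["description"],
        "|date=" + info["date"],
        "|source=" + info["source"],
        "|author=" + info["creator"],
        "}}",
        "",
        "=={{int:license-header}}==",
        "{{" + info["license_template"] + "}}",
        "",
    ]
    # One category link per category selected in the popup.
    lines += ["[[Category:" + c + "]]" for c in info.get("categories", [])]
    return "\n".join(lines)


example = build_wikitext({
    "description": "Market square in Helsinki",
    "date": "1920",
    "source": "https://example.org/record/123",  # placeholder URL
    "creator": "Unknown photographer",
    "license_template": "Cc-by-4.0",
    "categories": ["Helsinki"],
})
```

The resulting string would be sent as the page text of the upload request, alongside the image file itself.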

If the license information from the image's source indicates that the image is not eligible for upload, the upload button is disabled and a message informs the user about the license constraints.
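A license check of this kind could look like the sketch below. The allowlist and its identifiers are assumptions chosen for illustration (Commons accepts free licenses such as CC0, CC BY, and CC BY-SA, and rejects non-commercial or no-derivatives variants); the identifiers actually delivered by the image source may differ, and the function name is hypothetical.

```python
# Assumption: license identifiers as they might arrive from the image source.
UPLOADABLE_LICENSES = {"CC0 1.0", "CC BY 4.0", "CC BY-SA 4.0", "PDM 1.0"}


def upload_button_state(license_id):
    """Return whether the upload button is disabled and the message to show."""
    if license_id in UPLOADABLE_LICENSES:
        return {"disabled": False, "message": ""}
    shown = license_id if license_id else "unknown"
    return {
        "disabled": True,
        "message": "This image cannot be uploaded: license '" + shown
                   + "' is not accepted on Wikimedia Commons.",
    }
```

An allowlist is used rather than a blocklist so that an unknown or missing license disables the upload by default.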

The integration of this popup interface, along with the seamless conversion of image metadata into Wikitext, was achieved through a combination of user interface design and backend logic. This enhancement simplifies and accelerates the media contribution process while maintaining compliance with Wikimedia Commons standards.

Structured Data on Commons (SDC)

The structured data aspect of the project focused on generating meaningful structured data statements for media files uploaded to Wikimedia Commons. These statements would enhance the contextual information available to users exploring the media.

After an image has been uploaded successfully, the image's category is automatically added to it as a "depicts" structured data statement.

This functionality brings together the uploaded image's metadata, the selected category, and Wikimedia Commons' structured data system. The feature enhances the overall quality and accessibility of media files on Wikimedia Commons, aligning with the platform's mission to provide accurate and informative content.
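For illustration, a "depicts" statement can be added with the Wikibase `wbcreateclaim` API action, using the property P180 (depicts) and the MediaInfo entity ID, which is the letter M followed by the file's page ID. The sketch below only builds the request parameters; the function name is hypothetical, and whether the project uses this exact action is an assumption.

```python
import json


def build_depicts_request(page_id, qid, csrf_token):
    """Build POST parameters that add a 'depicts' (P180) statement to a file.

    `page_id` is the page ID of the uploaded file on Commons and `qid` is the
    Wikidata item ID of the depicted topic, for example "Q1757".
    """
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError("not a Wikidata item ID: " + repr(qid))
    return {
        "action": "wbcreateclaim",
        "format": "json",
        "entity": "M" + str(page_id),  # MediaInfo ID derived from the page ID
        "property": "P180",
        "snaktype": "value",
        "value": json.dumps({"entity-type": "item", "numeric-id": int(qid[1:])}),
        "token": csrf_token,
    }
```

These parameters would be sent as a POST request to the Commons API together with the user's OAuth 2.0 access token.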

Links to code

https://github.com/Wikidocumentaries/wikidocumentaries-ui/pull/108

https://github.com/Wikidocumentaries/wikidocumentaries-api/pull/31

Summary

In summary, the project successfully streamlined the media contribution workflow for Wikidocumentaries by addressing challenges in authentication, media upload, and structured data generation. The integration of OAuth 2.0 authentication facilitated secure interactions, while the upload functionality simplified media contribution. The generation of structured data statements enhanced the value of media files on Wikimedia Commons.

Future steps

With the core aspects of the project now essentially completed, a few open questions remain to be addressed in the future. Resolving them could make the project more effective and improve the user experience.

OAuth Token Renewal: A potential future improvement would involve addressing the issue of OAuth tokens expiring. Currently, users need to manually refresh their tokens. Implementing an automated token renewal mechanism could streamline the user experience, ensuring uninterrupted authentication for prolonged user sessions.
Handling Special Characters: The project encountered challenges related to special characters in image filenames. To provide a seamless user experience, it's advisable to implement a mechanism that handles special characters and encodes them appropriately during the upload process. This would prevent potential errors and inconsistencies in image handling.
Popup Structure Enhancement: One observed issue is that the current popup structure sometimes causes the "upload finish" UI to become invisible, effectively trapping users within the upload process. To rectify this, the popup structure could be refined to ensure a clear and visible path for users to complete their uploads and navigate through different stages without confusion.
Error Handling and Feedback: Enhancing error messages and providing more comprehensive feedback in cases of server errors or unsuccessful actions could greatly improve user understanding and troubleshooting. Clearer error messages would aid users in identifying and addressing issues during their interactions with the application.
Continuous Documentation: Maintaining up-to-date and comprehensive documentation is crucial for the project's longevity. Continuously updating the documentation to reflect changes, improvements, and troubleshooting tips would be immensely beneficial for both contributors and users.
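As one possible approach to the special-character item above, the sketch below cleans a title before it is used as a Commons filename. It is an assumption-laden illustration rather than the project's code: the function name and the fallback name are hypothetical, and the replaced characters are those that MediaWiki does not allow in page titles (`# < > [ ] | { }`) plus `:`, `/`, and `\`, which MediaWiki's default configuration treats as illegal in file names.

```python
import re
import unicodedata

# Characters not allowed in MediaWiki titles, plus ":", "/" and "\".
_FORBIDDEN = re.compile(r"[#<>\[\]|{}:/\\]")


def sanitize_commons_filename(title, extension):
    """Return a filename that avoids characters rejected in Commons titles."""
    name = unicodedata.normalize("NFC", title)  # keep letters such as "Å" intact
    name = _FORBIDDEN.sub("-", name)            # replace forbidden characters
    name = re.sub(r"[\s_]+", " ", name)         # collapse whitespace and underscores
    name = re.sub(r"[\x00-\x1f\x7f]", "", name) # drop remaining control characters
    name = name.strip(" .-")
    if not name:
        name = "Untitled"                       # hypothetical fallback name
    return name + "." + extension.lstrip(".").lower()
```

When the filename is later placed in a URL or form body, percent-encoding should be left to the HTTP library rather than done by hand.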

Incorporating these future work suggestions would not only enhance the project's functionality but also contribute to an improved user experience and a more seamless workflow for contributors and users alike.

Sequence diagram of the upload process

@startuml
actor User as Foo1
Foo1 -> Frontend : click upload button
Frontend -> Backend : Oauth2 token
Backend -> CommonsAPI : get csrf token request with Oauth2 token
CommonsAPI -> Backend : csrf token
Backend -> Frontend : csrf token
Frontend -> Backend : finna Id
Backend -> FinnaAPI : download image with finna Id
FinnaAPI -> Backend : image data
Backend -> Frontend : download success
Frontend -> Backend : tokens, finna Id and image text
Backend -> FinnaAPI : check image info
FinnaAPI -> Backend : image data
Backend -> CommonsAPI : upload image
CommonsAPI -> Backend : upload response, image page title on commons
Backend -> Frontend : upload response, image page title
Frontend -> Backend : tokens, depict Id and image page title
Backend -> CommonsAPI : image page title
CommonsAPI -> Backend : image page Id
Backend -> CommonsAPI : depict with image page Id
CommonsAPI -> Backend : depict response
Backend -> Frontend : depict response
Frontend -> Foo1 : final response
@enduml

http://www.plantuml.com/plantuml/png/XPF1JiCm38RlVGe_m3Z0SGSqG9gq4rnu09DuMzGqBecZyVXqMSeyr6Wzz11_l_txRxf9Wb7ouBiEZN24o_EPA08cs38_Tjtv3G_Fi8qSX8A5DHZlJ0zvz8mMlh88XwwRwsVs5KrFmQxX7RCSdq2ufmcfnnFmniF08RryxDPm806Julv2GQJlJ4dWvEJuJtzLwHbLgoTekmfekTox411szaP_FYl-B9z2sTGDUA3YIobcJZUif4N1HKkgPyh6K4eDOyLqMlsD0mop3Q4VMVPhZcJwIhcmv_iU6nycZzUrM-5N9b8QfoAjQjknM7JY2JY0kKWRMTnjSISSIwCpejFHuiPo2zorgiTeS6TRtOg7UcTl1OtSgc1UUZ6Q1Ke7KNxCSddHAsGpbVFWVm40
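
The Commons API calls in the diagram (CSRF token, upload, page ID lookup) can be sketched as plain request parameters for the MediaWiki Action API. This is a hedged illustration, not the project's backend code: the function names are hypothetical, no network request is made, and the image file itself would be attached as a multipart `file` field by whichever HTTP client is used.

```python
API_URL = "https://commons.wikimedia.org/w/api.php"


def auth_headers(access_token):
    """OAuth 2.0 access tokens are sent as a Bearer token."""
    return {"Authorization": "Bearer " + access_token}


def csrf_token_request():
    """Parameters for fetching a CSRF token (sent with the auth headers)."""
    return {"action": "query", "meta": "tokens", "type": "csrf", "format": "json"}


def upload_request(filename, wikitext, csrf_token, comment):
    """Form fields for action=upload; the file is sent as a multipart part."""
    return {
        "action": "upload",
        "format": "json",
        "filename": filename,
        "text": wikitext,    # description page of the new file
        "comment": comment,  # upload summary
        "token": csrf_token,
    }


def page_id_request(file_title):
    """Parameters for looking up the page ID of e.g. 'File:Example.jpg'."""
    return {"action": "query", "titles": file_title, "format": "json"}


def extract_page_id(response):
    """Read the page ID from a default-format action=query response."""
    page = next(iter(response["query"]["pages"].values()))
    if "pageid" not in page:
        raise LookupError("page not found: " + page.get("title", "?"))
    return page["pageid"]
```

The page ID returned by `extract_page_id` is what the final "depict" step needs in order to address the uploaded file.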