License integration MediaWiki/Current structure on Commons

Unfortunately, neither author nor license information is currently stored in a structured way that would allow fetching it from the MediaWiki API.

In the case of Wikimedia Commons (commons.wikimedia.org, the media repository for Wikipedia), there is a somewhat structured way to extract it from the generated HTML.

File:Example.svg (public-domain file)
..
Description Example.svg
  English: Image sample for example, in SVG
  Français : Échantillon d'image pour exemple, en SVG
..
Date 10 July 2006
..
Source Own work
..
Author <a href="/wiki/User:Nethac_DIU" title="User:Nethac DIU" class="mw-redirect">Nethac DIU</a>
..

File:Bustaxi.jpg (Creative Commons file)
..
<td id="fileinfotpl_desc" class="fileinfo-paramfield">Description Bustaxi.jpg
  English: A taxi-bus is used on bus lines with little traffic; here shown next to a 'normal' bus. Assen, the Netherlands.
  Deutsch: Ein Taxi-Bus wird auf Bus-Linien mit wenig Verkehr verwendet; hier neben einem „normalen“ Bus in Assen, Niederlande
..
<td id="fileinfotpl_date" class="fileinfo-paramfield">Date <time class="dtstart" datetime="2004-07">July 2004
..
<td id="fileinfotpl_src" class="fileinfo-paramfield">Source Own work
..
<td id="fileinfotpl_aut" class="fileinfo-paramfield">Author Photograph: <a href="/wiki/User:Andre_Engels" title="User:Andre Engels">Andre Engels</a> Own picture from <a href="/wiki/User:Andre_Engels" title="User:Andre Engels">Andre Engels</a>.
..

The id attributes on these cells (fileinfotpl_desc, fileinfotpl_date, fileinfotpl_src, fileinfotpl_aut) make it possible, after an HTML-to-DOM conversion, to easily extract the information that matters (ignoring any extra styling, presentational elements etc.).

The Stockphoto gadget on Wikimedia Commons is specifically designed to extract this information, so that users can easily get boilerplate code for re-using an image while honouring the license and attribution requirements.

A small sample:

Standalone
If you're running a web service or some other server-side script that needs this information, you'll have to fetch the HTML from the API (or fetch the wikitext and pass it through the parser to get the HTML), then interpret that HTML and look for the elements manually (with very creative use of regular expressions, substring searching, or a fabulous DOM library that can handle it). Then strip the tags to extract the plain-text value.
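As a rough illustration of the first step, the sketch below asks the MediaWiki API for the rendered HTML of a file description page. It assumes the Commons endpoint at https://commons.wikimedia.org/w/api.php, the action=parse module with prop=text, and the formatversion=2 JSON layout, in which the HTML is expected as a plain string. The names build_parse_url and fetch_rendered_html, and the User-Agent value, are hypothetical and chosen only for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for Wikimedia Commons; adjust for other wikis.
API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def build_parse_url(title):
    """Build an action=parse URL asking for the rendered HTML of a page."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "text",
        "format": "json",
        "formatversion": "2",
    }
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_rendered_html(title, user_agent="license-info-sketch/0.1"):
    """Return the rendered HTML of the given page as a string."""
    request = urllib.request.Request(
        build_parse_url(title), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumption: with formatversion=2 the HTML sits directly under parse.text.
    return payload["parse"]["text"]
```

Calling fetch_rendered_html("File:Bustaxi.jpg") should then return the HTML that contains the cells shown in the examples above.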

Use the File:Example.svg and File:Bustaxi.jpg samples above to know which elements to look for.
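The lookup itself could be sketched as follows, using only Python's standard html.parser module rather than regular expressions. This is a minimal sketch, not the method used by any existing tool. It assumes that each label cell carries an id starting with fileinfotpl_ (as in the Bustaxi.jpg sample) and that the value sits in the td cell immediately following that label cell. The class name FileInfoParser and the function extract_file_info are hypothetical.

```python
from html.parser import HTMLParser

PREFIX = "fileinfotpl_"


class FileInfoParser(HTMLParser):
    """Collect the plain text of the cell following each fileinfotpl_* cell."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.fields = {}       # e.g. {"aut": "...", "date": "..."}
        self._pending = None   # key whose value cell is expected next
        self._current = None   # key whose value cell we are inside
        self._depth = 0        # <td> nesting depth inside the value cell
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "td":
            return  # other tags (a, time, span, ...) are simply stripped
        if self._current is not None:
            self._depth += 1
            return
        cell_id = dict(attrs).get("id") or ""
        if cell_id.startswith(PREFIX):
            self._pending = cell_id[len(PREFIX):]
        elif self._pending is not None:
            # Assumption: the value is the next cell after the label cell.
            self._current, self._pending = self._pending, None
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "td" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace left behind by removed tags.
                text = " ".join("".join(self._chunks).split())
                self.fields[self._current] = text
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._chunks.append(data)


def extract_file_info(html):
    """Return a dict mapping field keys (desc, date, src, aut) to text."""
    parser = FileInfoParser()
    parser.feed(html)
    parser.close()
    return parser.fields
```

Given the rendered HTML of a file description page, extract_file_info(html) returns a dictionary keyed by the id suffix, so the author would be found under "aut" and the date under "date". Pages that do not use these ids yield an empty dictionary.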

Can use Toolserver
If you're OK with using a third-party service to do it for you, you can use Magnus' Commons API (source code). Magnus has walked the brave path described above in the "Standalone" section and made it available for others to use.

Alternatively, you could use that code as a base and still do it standalone.