Difference between revisions of "Metadata"
Revision as of 19:28, 10 June 2024
To give each ZIM file a description that can be easily extracted, we defined a special namespace M and a standardized set of keys that should be used.
Every key is defined like an article: the key name is used as the article name and the key value is stored as the article text. This way the metadata is compressed like any other content, yet remains extensible. Additional keys may be used in a ZIM file without breaking the standard, but be aware that the openZIM project may define further keys in the future. Any ZIM library reading this metadata should tolerate missing keys or values and simply return NULL in such cases.
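The lookup rule above can be sketched in a few lines. This is only an illustration: the dictionary stands in for the M namespace of an opened ZIM file, `get_metadata` and `X-Custom` are hypothetical names, and decoding values as UTF-8 is an assumption rather than something stated on this page.

```python
from typing import Mapping, Optional

# Hypothetical in-memory stand-in for the M namespace of an opened ZIM file:
# article name (the key) -> article text (the value) as raw bytes.
MetadataNamespace = Mapping[str, bytes]


def get_metadata(namespace: MetadataNamespace, key: str) -> Optional[str]:
    """Return the value stored under `key`, or None if it is missing or empty."""
    raw = namespace.get(key)
    if not raw:
        return None
    # Assumption: textual metadata values are UTF-8 encoded.
    return raw.decode("utf-8")


m_namespace = {
    "Title": b"English Wikipedia",
    "Language": b"eng",
    # An extra, non-standard key does not disturb readers that ignore it.
    "X-Custom": b"some value",
}

print(get_metadata(m_namespace, "Title"))    # English Wikipedia
print(get_metadata(m_namespace, "License"))  # None
```

Returning `None` instead of raising keeps a reader tolerant of files that omit optional keys or were written before a key was standardized.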
Keys
Key | Mandatory | Description | Example |
---|---|---|---|
Name | yes | A human readable identifier for the resource. It's the same across versions (should be stable across time). | wikipedia_fr_football |
Title | yes | Title of the ZIM file. 30 graphemes maximum recommended. | English Wikipedia |
Creator | yes | creator(s) of the ZIM file content | English speaking Wikipedia contributors |
Publisher | yes | creator of the ZIM file itself | Wikipedia user Foobar |
Date | yes | ZIM creation date (ISO - YYYY-MM-DD) | 2009-11-21 |
Description | yes | Description of the content (one short sentence). 80 graphemes maximum recommended. | All articles (without images) from the English Wikipedia |
LongDescription | no | Extended description of the content. Carriage return allowed. 4,000 graphemes maximum recommended. | This ZIM file contains all articles (without images) from the English Wikipedia as of 2009-11-10. The topics are ... |
Language | yes | ISO 639-3 language identifier. If there are several, they are comma-separated and ordered by "importance" (normally the number of entries, though in an edge case another criterion may be used). | eng |
License | no | License code of the content. | CC-BY |
Tags | no | A list of tags | wikipedia;_category:wikipedia;_pictures:no;_videos:no;_details:yes;_ftindex:yes |
Relation | no | URI of external related resources | |
Flavour | no | A human readable string describing how the content has been scraped. It's the same across versions (should be stable across time). | nopic |
Source | no | URI of the original source | https://en.wikipedia.org/ |
Counter | no | Number of non-redirect entries per mime-type in the C namespace | image/jpeg=5;image/gif=3;image/png=2;... |
Scraper | no | Details about the software used to scrape the content, with its version | mwoffliner 1.2.3 |
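Illustration_[height]x[width]@[scale] | yes | A PNG image (resolution [height] by [width]) to illustrate the ZIM file. This must be binary content (PNG) with MIME type `image/png`. We follow the same specification as freedesktop (https://specifications.freedesktop.org/icon-theme-spec/icon-theme-spec-latest.html) for the size and scale of the icon. | |

`height`, `width` and `scale` describe the target size (where the icon is intended to be displayed):

- `Illustration_48x48@1` is a 48x48 pixel image to be displayed as a 48x48 icon on a scale 1 screen.
- `Illustration_48x48@2` is a 96x96 pixel image to be displayed as a 48x48 icon on a scale 2 screen.
- `Illustration_96x96@1` is a 96x96 pixel image to be displayed as a 96x96 icon on a scale 1 screen.

`Illustration_48x48@1` is mandatory. Others are optional.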
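As a rough illustration of how a tool might work with the keys in the table above, the sketch below checks for mandatory keys, parses a `Counter` value, and derives the pixel size implied by an `Illustration_` key name. All function names are hypothetical, the metadata is modeled as a plain dictionary, and treating `scale` as an integer is an assumption.

```python
import re
from typing import Dict, List, Mapping, Tuple

# Keys marked "yes" in the table above.
MANDATORY_KEYS = (
    "Name", "Title", "Creator", "Publisher", "Date",
    "Description", "Language", "Illustration_48x48@1",
)

# Assumption: height, width and scale are all plain decimal integers.
ILLUSTRATION_KEY = re.compile(r"Illustration_(\d+)x(\d+)@(\d+)")


def missing_mandatory_keys(metadata: Mapping[str, object]) -> List[str]:
    """List the mandatory keys that are absent from `metadata`."""
    return [key for key in MANDATORY_KEYS if key not in metadata]


def parse_counter(value: str) -> Dict[str, int]:
    """Parse a Counter value such as 'image/jpeg=5;image/gif=3'.

    Items without a numeric count are skipped. This naive split does not
    handle MIME types that themselves contain ';' or '='.
    """
    counts: Dict[str, int] = {}
    for item in value.split(";"):
        mime, sep, count = item.rpartition("=")
        if sep and mime and count.isdigit():
            counts[mime] = int(count)
    return counts


def illustration_pixel_size(key: str) -> Tuple[int, int]:
    """Return (pixel height, pixel width) implied by an Illustration key."""
    match = ILLUSTRATION_KEY.fullmatch(key)
    if match is None:
        raise ValueError(f"not an illustration key: {key!r}")
    height, width, scale = (int(group) for group in match.groups())
    # The key names the target size; the stored image is scaled up.
    return height * scale, width * scale


print(missing_mandatory_keys({"Name": "wikipedia_fr_football"}))
print(parse_counter("image/jpeg=5;image/gif=3;image/png=2"))
print(illustration_pixel_size("Illustration_48x48@2"))  # (96, 96)
```

The recommended grapheme limits for `Title`, `Description` and `LongDescription` are left out on purpose: counting graphemes correctly needs Unicode segmentation support that the Python standard library does not provide.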
Favicon (old ZIM files)
Old ZIM files may have an illustration at `-/favicon` (it can be a redirect to the real content).
Readers must be able to read the illustration using this path.
Writers must not set a `-/favicon`.
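One way a reader might combine the current key with the legacy path is sketched below. The archive is modeled as a plain dictionary, and the `Redirect` class, the `read_illustration` function, the `M/` path prefix, the `I/logo.png` path, and the hop limit are all assumptions made for the example, not part of the format.

```python
from typing import Mapping, Optional, Union


class Redirect:
    """Hypothetical marker for an entry that points at another path."""

    def __init__(self, target: str) -> None:
        self.target = target


# Hypothetical archive model: path -> PNG bytes, or a Redirect to another path.
Entry = Union[bytes, Redirect]


def read_illustration(entries: Mapping[str, Entry], max_hops: int = 8) -> Optional[bytes]:
    """Return the 48x48 illustration, falling back to the legacy favicon."""
    for path in ("M/Illustration_48x48@1", "-/favicon"):
        entry = entries.get(path)
        hops = 0
        # The legacy favicon may be a redirect to the real content.
        while isinstance(entry, Redirect) and hops < max_hops:
            entry = entries.get(entry.target)
            hops += 1
        if isinstance(entry, bytes):
            return entry
    return None


old_zim = {"-/favicon": Redirect("I/logo.png"), "I/logo.png": b"\x89PNG-old"}
new_zim = {"M/Illustration_48x48@1": b"\x89PNG-new"}

print(read_illustration(old_zim))  # b'\x89PNG-old'
print(read_illustration(new_zim))  # b'\x89PNG-new'
```

Trying the metadata key first means files that follow the current rules never touch the legacy path, while older files still yield an image.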
See also
- Dublin Core (http://dublincore.org/documents/dces/)