Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat

Wikidata project chat
A place to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.

Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Other places to find help

For realtime chat rooms about Wikidata, see Wikidata:IRC.
On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/08.

Qualifier value violations?

I fail to understand what the issue is with these Qualifier value violations. What is expected for the number of members of an organization? Judging from the fairly large number of violations, either something is going really wrong here or the expected entries are counterintuitive; I cannot tell. Anyhow, any insight on this is welcome. Cheers [[kgh]] (talk) 15:22, 22 July 2024 (UTC)[reply]

Wow, nobody with a clue? [[kgh]] (talk) 07:58, 26 July 2024 (UTC)[reply]
That constraint is an unofficial and undocumented one and I don't know what it's supposed to mean on that property (there's already allowed qualifiers constraint (Q21510851) for which qualifiers are allowed). Since it doesn't seem to be doing anything useful, I've removed it. - Nikki (talk) 09:59, 29 July 2024 (UTC)[reply]
That explains things. Thank you for your feedback and support. Cheers --[[kgh]] (talk) 12:29, 29 July 2024 (UTC)[reply]

Datatype-change proposal

I would like to propose changing the data type of NIOSH Pocket Guide ID (P1931) to external identifier. The current data type is string. Janhrach (talk) 15:38, 22 July 2024 (UTC)[reply]

  • @Janhrach: At the time of the initial datatype conversions for external IDs, this property was listed as disputed because the same ID was applied to multiple items - "only 94.17% unique out of 720 uses". This should probably be cleaned up if it hasn't been in the meantime, or it should be clarified how the property is meant to be used. ArthurPSmith (talk) 13:59, 23 July 2024 (UTC)[reply]
    @ArthurPSmith: The problem is that some IDs refer to groups of chemical entities, e.g. [1]. Is this a reason why P1931 couldn't have the external identifier data type? Janhrach (talk) 09:19, 27 July 2024 (UTC)[reply]
    If you want it to be an identifier then it should uniquely identify things. The ID for "Be and Be compounds" should not be applied to just Be or particular compounds, it is an identifier for the whole group. If we have an item for the whole group then the ID could be applied there. Or if you just want to have that link without it being really an identifier then string datatype is what you want. ArthurPSmith (talk) 13:03, 29 July 2024 (UTC)[reply]
    Thanks. Janhrach (talk) 08:05, 2 August 2024 (UTC)[reply]
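
A uniqueness check of the kind quoted above can be sketched in a few lines. This is only an illustration: it assumes "unique" means distinct values divided by total uses (under that reading, 94.17% of 720 uses corresponds to 678 distinct values), and the function name and sample values are made up rather than taken from the original report.

```python
from collections import Counter


def uniqueness_report(values):
    """Return (percent_unique, shared) for a list of identifier values.

    percent_unique is distinct values / total uses * 100, which is an
    assumed definition of "unique" and not necessarily the original one.
    shared maps every value used more than once to its number of uses.
    """
    if not values:
        raise ValueError("no values to check")
    counts = Counter(values)
    percent_unique = round(100 * len(counts) / len(values), 2)
    shared = {value: n for value, n in counts.items() if n > 1}
    return percent_unique, shared


# Hypothetical data: 720 uses, of which 678 are distinct values.
sample = [f"{i:04d}" for i in range(678)] + ["0000"] * 42
percent, shared = uniqueness_report(sample)
print(percent, shared)
```

The `shared` mapping is the part worth reviewing by hand: each entry is either a cleanup candidate or a group ID such as "Be and Be compounds" that belongs on a single group item.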

Non-unique statement id in Q85046372

According to Wikidata documentation: [stmt_id is] An arbitrary identifier for the Statement, which is unique across the repository.

But on the Wikidata page for Secondary limb lymphedema (Q85046372), the page source shows that Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051 is used twice, each time with different underlying data:

<div id="Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051" class="wikibase-statementview wikibase-statement-Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051 wb-normal">...</div>
...
<div id="Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051" class="wikibase-statementview wikibase-statement-Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051 wb-normal">...</div>

Both ids show up in cites work (P2860): Arm morbidity after sector resection and axillary dissection with or without postoperative radiotherapy in breast cancer stage I. Results from a randomised trial. Uppsala-Orebro Breast Cancer Study Group (Q73307092) and Case-control study to evaluate predictors of lymphedema after breast cancer surgery (Q37410695). 195.191.163.76 07:26, 26 July 2024 (UTC)[reply]

REST API response for the two P2860 statements with identical IDs:
curl -s https://www.wikidata.org/w/rest.php/wikibase/v0/entities/items/Q85046372 | jq '.statements[][] | select(.id == "Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051")'
JSON object from API response
{
  "id": "Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051",
  "rank": "normal",
  "qualifiers": [
    {
      "property": {
        "id": "P1545",
        "data-type": "string"
      },
      "value": {
        "type": "value",
        "content": "10"
      }
    },
    {
      "property": {
        "id": "P1545",
        "data-type": "string"
      },
      "value": {
        "type": "value",
        "content": "13"
      }
    }
  ],
  "references": [
    {
      "hash": "7c52980f6382f58bc9ff3831c60ec37b6e0618c0",
      "parts": [
        {
          "property": {
            "id": "P248",
            "data-type": "wikibase-item"
          },
          "value": {
            "type": "value",
            "content": "Q5188229"
          }
        },
        {
          "property": {
            "id": "P356",
            "data-type": "external-id"
          },
          "value": {
            "type": "value",
            "content": "10.1016/J.LPM.2009.06.023"
          }
        },
        {
          "property": {
            "id": "P854",
            "data-type": "url"
          },
          "value": {
            "type": "value",
            "content": "https://api.crossref.org/works/10.1016/J.LPM.2009.06.023"
          }
        },
        {
          "property": {
            "id": "P813",
            "data-type": "time"
          },
          "value": {
            "type": "value",
            "content": {
              "time": "+2024-07-15T00:00:00Z",
              "precision": 11,
              "calendarmodel": "http://www.wikidata.org/entity/Q1985727"
            }
          }
        }
      ]
    }
  ],
  "property": {
    "id": "P2860",
    "data-type": "wikibase-item"
  },
  "value": {
    "type": "value",
    "content": "Q73307092"
  }
}
{
  "id": "Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051",
  "rank": "normal",
  "qualifiers": [],
  "references": [],
  "property": {
    "id": "P2860",
    "data-type": "wikibase-item"
  },
  "value": {
    "type": "value",
    "content": "Q37410695"
  }
}
--Dhx1 (talk) 13:44, 29 July 2024 (UTC)[reply]
You might want to repost this at Wikidata:Report a technical problem or open a Phabricator ticket. @Mohammed Abdulai (WMDE) is always very helpful. William Graham (talk) 16:31, 29 July 2024 (UTC)[reply]
reposted it in Report a technical problem 195.191.163.76 11:52, 30 July 2024 (UTC)[reply]
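
For anyone who wants to check other items for the same problem, here is a minimal sketch in Python. It assumes the input is shaped like the "statements" member of the REST API response shown above, i.e. a mapping from property ID to a list of statements; the function name and the third sample statement (including its ID) are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_statement_ids(statements):
    """Map each statement ID that occurs more than once to its uses.

    `statements` is assumed to look like {property_id: [statement, ...]},
    where every statement has an "id" and usually a "value" member.
    """
    seen = defaultdict(list)
    for property_id, group in statements.items():
        for statement in group:
            value = statement.get("value", {}).get("content")
            seen[statement["id"]].append((property_id, value))
    return {sid: uses for sid, uses in seen.items() if len(uses) > 1}


# Trimmed copies of the two statements quoted above, plus one made-up
# statement (hypothetical ID and value) that is not duplicated.
statements = {
    "P2860": [
        {"id": "Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051",
         "value": {"type": "value", "content": "Q73307092"}},
        {"id": "Q85046372$70E829CD-2D80-48D1-BB71-8EE2B5C22051",
         "value": {"type": "value", "content": "Q37410695"}},
        {"id": "Q85046372$00000000-0000-0000-0000-000000000000",
         "value": {"type": "value", "content": "Q1"}},
    ]
}
duplicates = find_duplicate_statement_ids(statements)
print(duplicates)
```

Feeding it the parsed JSON from the curl command above (the "statements" member of the response) should flag the same ID for Q85046372 as long as the problem persists.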

What to do with an item that seems to conflate two or more people

This item Jo Jo Smith (Q99646571) seems to conflate at least two people, a baseball player and a musician. What should one do here? StarTrekker (talk) 02:06, 29 July 2024 (UTC)[reply]

This is identical to the above: statements need to be moved from one item to another. Do you feel like you can tease out which statements and interwiki links apply to which person? —Justin (koavf)TCM 02:13, 29 July 2024 (UTC)[reply]
The item was based on https://id.worldcat.org/fast/405596/ for which the source seems to have been https://id.loc.gov/authorities/names/n97866648.html, but it looks like any identifier with a matching name was added. There are three people: baseball player Joseph Edward Smith (Baseball Cube player ID and Trading Card Database person ID), a musician (Discogs 4903638 and Australian Women's Register), and jazz dancer Joseph Benjamin Smith (the other identifiers; probably the intended subject). I couldn't find existing items for any of these three, but when searching I found another conflation: Joseph Edward Smith (Q105395287). Peter James (talk) 16:11, 29 July 2024 (UTC)[reply]
See Help:Conflation of two people. Best to create new items for all those people, move/copy statements to the appropriate new items and then nominate the conflation items for deletion. --2A02:810B:580:11D4:D861:7C54:F26A:7398
  • If the VIAF or LCCN is a conflation at their website, best to keep it and mark it as a conflation, so VIAF and LCCN can correct it in the future. "The original item should be kept if some identifiers remain". A bot will eventually reupload it if deleted. You can use "has parts" to point to the correct people. --RAN (talk) 20:20, 30 July 2024 (UTC)[reply]
  • I have teased apart three individuals: the baseball player and two musicians. More may be needed, and some may end up merged with existing entities. --RAN (talk) 22:10, 30 July 2024 (UTC)[reply]

"Default Values for All Languages" Feature - Share Your Feedback!

Hello,

Last week, we announced that a limited release of the “default values for all languages” feature—introducing the language code "mul" for labels and aliases—will soon be coming to Wikidata. We are currently working on improvements for “mul” in the Termbox on Item pages. We’ve already received feedback from some of you on the discussion pages, but we’d also love to hear from those who prefer to provide anonymous feedback.

Please share your thoughts on this 5-10 minute anonymous survey until August 4: https://wikimedia.sslsurvey.de/Wikidata-default-values-feedback.

If you have any questions or concerns feel free to let us know in this Phabricator ticket (phab:T356169)

Many thanks for your time. -Mohammed Abdulai (WMDE) (talk) 11:40, 29 July 2024 (UTC)[reply]

@Mohammed Abdulai (WMDE): It seems I can't use the search box to search for items that only have a default label (mul). See for instance Casey Szilvia (Q128347219). The search term 'Casey Szilvia' doesn't show the item.
P.S. I also wrote this further down on this page. Sabelöga (talk) 20:54, 3 August 2024 (UTC)[reply]

Wikidata weekly summary #638

Edit request

Hello, I wanted to add an entry to maize (Q11575), but it seems to be protected. Can any of you add مکابؤج (link) from Gilaki Wikipedia (glk) to maize (Q11575)? Mehrshad Mehdi pour (talk) 09:22, 30 July 2024 (UTC)[reply]

✓ Done. --Wolverène (talk) 09:58, 30 July 2024 (UTC)[reply]
Thanks. Mehrshad Mehdi pour (talk) 10:29, 30 July 2024 (UTC)[reply]

Data donation - initial questions

Hi Wikidata Community! I wanted to start a discussion about data donation and hopefully get some guidance on what's the best place to begin + what data would be most useful. I work with an open data platform (https://www.workwithdata.com/) and we have millions of datapoints, all from open data sources like the UN, World Bank, British Library, or Tate. We were thinking that it would be amazing to add our data to Wikidata to enrich it, especially as it all corresponds to existing items here and pages on Wikipedia. There is lots we would love to donate - data on countries' important economic & geographical metrics, books and authors, artists, politicians, etc. (all open data), but we were thinking that we could start with a smaller dataset such as artists (https://www.workwithdata.com/datasets/artists) to better understand how to match items and the relevant sitelinks. What do you think? Would love some direction and to learn how to go through all the data donation steps. AniaGrzybowska (talk) 09:49, 30 July 2024 (UTC)[reply]

I think it's better to import data from the original source. Sjoerd de Bruin (talk) 09:55, 30 July 2024 (UTC)[reply]
Hello, please see Wikidata:Data_donation M2k~dewiki (talk) 10:34, 30 July 2024 (UTC)[reply]
Thank you for the link! AniaGrzybowska (talk) 13:41, 30 July 2024 (UTC)[reply]
Yeah, I was thinking about doing that initially, but when we get all of the data from different sources, there are always differences and some missing values. We put together the sources about one item to get a more complete view, correct incorrect or missing info in some sources, and combine everything into datasets that cover what the UN says plus what the World Bank says (and so on), creating an open data source that has all of those, which we put into the public domain as well. I think it would be worth donating that as an additional point of reference! For example, for artists, MoMA says something different from Tate, which says something different from the Rijksmuseum, etc. One museum usually lacks some data which another one has (the date of death for an artist might be in MoMA's dataset but not in Tate's). We clean up the typos, resolve the discrepancies in data formatting, and check the individual datapoints. This is quite similar to the World Bank, actually, which also combines data from different agencies.
Thanks for the link as well! I came from that, actually. It encourages starting a discussion first and working out with the community which data would be most useful, so I came straight here. Should I then go ahead and try to prepare a data import for one dataset and go from there? I can do the artists. AniaGrzybowska (talk) 13:41, 30 July 2024 (UTC)[reply]
Also see
M2k~dewiki (talk) 13:50, 30 July 2024 (UTC)[reply]
Awesome, thank you! I'll read everything and get started on preparing the data. AniaGrzybowska (talk) 14:04, 30 July 2024 (UTC)[reply]
As your approach parallels that of WD, it might be an idea to do a SPARQL report for a set of artists and compare it with a WD set. It should be 90% in agreement, as we've probably used the same sources as you. Looking at the 10% will show if your quality control processes are superior to ours. If things look good, then we could look at using you as a reference for the facts we agree on that are unreferenced here. Do you record for each fact where it came from? If so, you could pass that information through. Finally, new facts could be added to WD quoting your site, or better, the original site, as a reference. This might sound like a lot of work, but if both sides are well structured, it should be automatable. Of course the reverse could be done: WD could be used to add to your datasets.
Having different projects working on the same thing introduces a refreshing set of alternative viewpoints. Vicarage (talk) 14:04, 30 July 2024 (UTC)[reply]
That's a really good point actually and it does sound doable. We can see where all datapoints come from, so we will be able to spot-check what causes any differences between the data we have and what's already in WD. I'll test a few data exports and imports to see how we can make that happen technically and we'll go from there. It might be a lot of work, yeah, but it'll be worth it in the end! We're working on something similar, but in the end we also want to contribute to the larger open data ecosystem, so adding what we have to WD and linking the sources is what it's all about.
What do you think is the best way to go about it? Once I compare the data and/or have a data import ready, would it be worth starting a new discussion here? AniaGrzybowska (talk) 15:39, 30 July 2024 (UTC)[reply]
It would be great to hear what the results of the data comparison are (either in this discussion, or a new one).
Specifically:
How many entities:
- Matched (exist in both datasets)
- Are new (and proposed for inclusion)
How many properties:
- Matched (exist in both datasets)
- Are new (and proposed for inclusion)
In both cases, would we be able to enrich these existing items and new properties with their source references?
It would also be great to know how many discrepancies you uncover when matching existing items and properties.
Also, feel free to do a sample import of a handful of new items, where the community can help provide feedback on the structure. Iamcarbon (talk) 01:27, 2 August 2024 (UTC)[reply]
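
The counts asked for above could be produced with something like the following sketch. It assumes both datasets have already been matched to a shared key and flattened to {entity_key: {property: value}}; the keys, property names and values below are invented for illustration and are not real data from either side.

```python
def compare_datasets(donor, wikidata):
    """Compare two {entity_key: {property: value}} mappings.

    Counts matched and new entities, matched and new property values on
    the matched entities, and lists discrepancies (same entity and
    property, different value).
    """
    matched_entities = donor.keys() & wikidata.keys()
    new_entities = donor.keys() - wikidata.keys()
    matched_props = 0
    new_props = 0
    discrepancies = []
    for key in sorted(matched_entities):
        for prop, value in donor[key].items():
            if prop not in wikidata[key]:
                new_props += 1
            elif wikidata[key][prop] == value:
                matched_props += 1
            else:
                discrepancies.append((key, prop, value, wikidata[key][prop]))
    return {
        "matched_entities": len(matched_entities),
        "new_entities": len(new_entities),
        "matched_properties": matched_props,
        "new_properties": new_props,
        "discrepancies": discrepancies,
    }


# Hypothetical sample data, for illustration only.
donor = {
    "artist-a": {"birth": "1900", "death": "1980"},
    "artist-b": {"birth": "1920"},
    "artist-c": {"birth": "1950"},
}
wikidata = {
    "artist-a": {"birth": "1900"},
    "artist-b": {"birth": "1921"},
}
report = compare_datasets(donor, wikidata)
print(report)
```

Keeping the discrepancies as a list rather than a count makes it easy to spot-check each one against the original source.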

described at URL (P973) and tekstowo.pl (Q126379084) again

Could someone make a batch to remove this spam from Wikidata so we can put the URL on the blacklist? Trade (talk) 14:41, 1 August 2024 (UTC)[reply]

Most were added by a bot (User:Reinheitsgebot) using Mix'n'match; it has stopped and the Mix'n'match has been deleted. The links could be removed if there is consensus for that, but is there any reason links to the site should be removed and not made into a property just as AZLyrics.com artist ID (P7194) and other English-language sites are? Peter James (talk) 19:05, 1 August 2024 (UTC)[reply]
If there are already enough properties for lyrics (Wikidata:Property proposal/Tekstowo.pl artist ID) then Lyrics007 artist ID (P7206) could be deleted as according to https://www.isitdownrightnow.com/lyrics007.com.html "the site has been unreachable for more than 130 days". Peter James (talk) 19:11, 1 August 2024 (UTC)[reply]
If you wish Lyrics007 artist ID (P7206) to be deleted, then you should request that. It has no bearing on this website.--Trade (talk) 05:20, 2 August 2024 (UTC)[reply]
At the same time, if you wish to make an identifier proposal for tekstowo.pl then you are welcome to do so --Trade (talk) 05:21, 2 August 2024 (UTC)[reply]

@EncycloPetey, Huntster, Peter James, Lymantria:--Trade (talk) 05:23, 2 August 2024 (UTC)[reply]

duplicates: Q56297769 can be merged into Q1121708

Q56297769 can be merged into Q1121708

If somebody can do this, I'd greatly appreciate it and will try to learn from it so I can do it myself next time :) TimBorgNetzWerk (talk) 12:41, 2 August 2024 (UTC)[reply]

ISBN-10 (P957) and ISBN-13 (P212) conflicts-with restriction on literary work and written work

Does this not defeat the whole point of the property? You can't use an ISBN property on books? And yes, I know that books can have many different ISBNs, but they can have many different identifiers, so this restriction makes the property basically useless. Yes I could hypothetically make a sub-item for every single edition the book has ever had digital or otherwise but that is not useful for linking purposes. Is this really how it's supposed to be done? PARAKANYAA (talk) 15:06, 2 August 2024 (UTC)[reply]

I agree with you, but the folks at Wikidata:WikiProject_Books are very keen on this approach, even for single edition books. Vicarage (talk) 15:20, 2 August 2024 (UTC)[reply]
That conflict is correct. ISBN values are assigned to specific editions of publications, not to literary works. There is no sensible way to assign an ISBN to Harry Potter and the Philosopher's Stone (Q43361) when there are hundreds of different editions and translations, each with their own ISBN values. For example Harry Potter and the Philosopher’s Stone (Q58242028) has an ISBN, because it is the 2014 paperback edition from the United Kingdom. And Harry Potter and the Philosopher's Stone (French edition) (Q58464836) has an ISBN because it is the 1998 French edition published by Éditions Gallimard. Each edition has its own ISBN value, so the ISBN is placed on the data item for the specific edition to which it applies. We don't put it on the main item for the literary work, because that would lead to multiple conflicting ISBN values on a data item. And each of those editions has its own publisher, publication date, editor, language, place of publication, etc., and there needs to be a separate data item to tie all of that information together. The system adopted here is modelled on the international system used by libraries to keep track of such things. --EncycloPetey (talk) 16:54, 2 August 2024 (UTC)[reply]
Well, I think that is dumb, but at least I know it's on purpose now. I'll just remove that property from works then PARAKANYAA (talk) 17:06, 2 August 2024 (UTC)[reply]
If you believe you can develop a superior system, I recommend writing it up and publishing in a journal of library science. The current system was developed through centuries of accumulated knowledge by hundreds of experts working together. But there is always the possibility someone will improve upon the current system. --EncycloPetey (talk) 19:23, 2 August 2024 (UTC)[reply]
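
As a rough illustration of the work/edition split described above, here is a small Python sketch. The class and field names are invented for this example, and the ISBN strings are placeholders, not the real ISBNs of these editions; only the item IDs and edition descriptions come from the discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Edition:
    qid: str
    description: str
    isbn: str  # one ISBN per edition (placeholder strings below)


@dataclass
class Work:
    qid: str
    title: str
    # Deliberately no ISBN field here: ISBNs live on the editions.
    editions: list = field(default_factory=list)

    def isbns(self):
        """All ISBNs reachable from the work through its editions."""
        return [edition.isbn for edition in self.editions]

    def find_edition(self, isbn):
        """Return the edition carrying this ISBN, or None."""
        for edition in self.editions:
            if edition.isbn == isbn:
                return edition
        return None


work = Work("Q43361", "Harry Potter and the Philosopher's Stone")
work.editions.append(
    Edition("Q58242028", "2014 UK paperback edition", "ISBN-PLACEHOLDER-UK"))
work.editions.append(
    Edition("Q58464836", "1998 French edition", "ISBN-PLACEHOLDER-FR"))
print(work.isbns())
```

Linking by ISBN still works in this layout: a lookup goes through the edition and then up to the work, so no ISBN needs to sit on the work item itself.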

Identical label and desc but no error?

Could someone figure out how the Wikidata system allows Qs:

Estopedist1 (talk) 16:24, 2 August 2024 (UTC)[reply]

There are Estonian articles for each and (based on machine translation to English) they appear to refer to different municipalities with the same name, or the same municipality that had parts in different historical administrative territories. It may be the case that the English labels and descriptions need to be updated to better distinguish them. William Graham (talk) 17:08, 2 August 2024 (UTC)[reply]
They both had their label and description changed to the same values in the same minute on 22 June 2024; no idea why one of the edits wasn't rejected. William Graham (talk) 17:25, 2 August 2024 (UTC)[reply]
@William Graham: yep, Wikidata bug. I changed one value and now I cannot change it back. Thanks for commenting! Estopedist1 (talk) 20:03, 2 August 2024 (UTC)[reply]

Property proposal for person/lifespan

I'm attempting to create my very first property proposal (https://www.wikidata.org/wiki/Wikidata:Property_proposal/lifespan), and have undoubtedly thoroughly messed it up, but hopefully not irretrievably. For certain persons, I wish to start recording their lifespan as given on funerary monuments, e.g., so-and-so lived X years, Y months, and Z days. This requires what appears to me to be a data type not otherwise used in Wikidata, namely a time duration. The existing Time data type merely represents a single point in time, and thus is not suitable here. For the moment, I've selected 'monolingual text' as the data type, with a regex constraining the allowed values (conforming to the ISO 8601 standard for such values). But I've no idea whether I've done any of this correctly. If someone more knowledgeable in how to craft property proposals could kindly critique my effort and offer suggestions on how to improve it, I would be most grateful. Thank you. Sarcanon (talk) 00:42, 3 August 2024 (UTC)[reply]

BTW, before anyone asks, the persons for which I will be adding this information will be from classical antiquity, and by convention, dates of birth and/or dates of death were never recorded, whereas the person's lifespan frequently was. Sarcanon (talk) 01:03, 3 August 2024 (UTC)[reply]
There's age of subject at event (P3629) which can be used as a qualifier on an unknown date of death. Ghouston (talk) 05:28, 3 August 2024 (UTC)[reply]
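
For reference, a value constraint along the lines described above might look like the following Python sketch. The proposal's actual regex is not quoted in this thread, so the pattern here is an assumption: it accepts only the years/months/days subset of ISO 8601 durations (for example P42Y3M12D) and requires at least one component.

```python
import re

# Assumed pattern: "P" followed by optional year, month and day parts,
# in that order. The lookahead rejects a bare "P" with no components.
LIFESPAN_PATTERN = re.compile(r"P(?=\d)(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?")


def parse_lifespan(text):
    """Return (years, months, days) for a duration such as 'P42Y3M12D'.

    Missing components default to 0. Raises ValueError when the text
    does not match the assumed pattern.
    """
    match = LIFESPAN_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a years/months/days duration: {text!r}")
    years, months, days = (int(part) if part else 0 for part in match.groups())
    return years, months, days


print(parse_lifespan("P42Y3M12D"))
```

Weeks and time-of-day parts (PnW, PTnH and so on) are left out on purpose here, since inscriptions of the kind described give years, months and days.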

KUOW

KUOW (Q6339681) is just a separately licensed transmitter for KUOW-FM (Q6339679). No programming content of its own, really just a repeater. I suppose it still merits a separate item, but I suspect the two items should somehow be related to one another, which they seem not to be currently. - Jmabel (talk) 05:42, 3 August 2024 (UTC)[reply]

dementia (Q83030)

Please add [Overtreatment of diabetes in older people] to dementia risk factor list. See: https://pubmed.ncbi.nlm.nih.gov/39093563/. Thanks Nirts (talk) 07:26, 3 August 2024 (UTC)[reply]

Hi. How does this work? Should they be merged or not? And if not, perhaps someone can tell me why? MGA73 (talk) 15:18, 3 August 2024 (UTC)[reply]

Q1301762 is the main item, Q21451891 is an instance of Wikimedia category (Q4167836) and exists to connect any categories to each other, and link them to the main item with the property category's main topic (P301). They shouldn't be merged, but if it only had a Commons category, that category could be moved to the main item and Q21451891 could be deleted (the Commons sitelink is in a category item if a separate item is necessary). I don't know if the Japanese Wikibooks category can also be moved, as other Wikibooks categories I checked were only connected to category items or were not in Wikidata. Peter James (talk) 15:48, 3 August 2024 (UTC)[reply]
Hello, only objects of the same type should be merged or connected to each other:
  • articles with the same articles in other languages (including galleries on Commons), describing the same entity (a person, a film, a geographical object, a taxon, ...)
  • categories with categories (including categories on Commons)
  • disambiguation items with disambiguation items
  • family name items with family name items
  • first name items with first name items
  • list items with list items
  • ...
The different objects can be cross-referenced (e.g. list related to category, main topic of the category, ...), so it's easier to navigate. On Commons, the Wikidata infobox shows the content of the item of the main topic when the commonscat is connected to a category item and the two items are cross-referenced.
Often, commonscats are directly connected to the item of main topic when there is no commonsgallery and not yet an item for the category.
M2k~dewiki (talk) 20:01, 3 August 2024 (UTC)[reply]
Also see Help:Merge M2k~dewiki (talk) 20:04, 3 August 2024 (UTC)[reply]

Searching for default labels (mul)

It seems I can't use the search box to search for items that only have a default label (mul). See for instance Casey Szilvia (Q128347219). The search term 'Casey Szilvia' doesn't show the item. Sabelöga (talk) 20:53, 3 August 2024 (UTC)[reply]

Also see Help:Default_values_for_labels_and_aliases#Where_can_I_report_problems? -> Help_talk:Default_values_for_labels_and_aliases#Searching_for_default_labels_(mul) M2k~dewiki (talk) 19:35, 4 August 2024 (UTC)[reply]

Additional languages cluttering suggestions

No doubt this is supposed to be helpful, but it is not. When attempting to type in a field, I don't need Finnish or Spanish suggestions showing up alongside English. Disable it immediately. Abductive (talk) 08:12, 4 August 2024 (UTC)[reply]

Circa

How do I add a note to a value of population (P1082) to say that the value is circa (Q5727902)? Eurohunter (talk) 23:26, 4 August 2024 (UTC)[reply]

Can you give the specific instance you're planning on using? E.g. if it was c. 1709, then you could use "1700s" and just have that level of precision. —Justin (koavf)TCM 23:43, 4 August 2024 (UTC)[reply]
@Koavf: population (P1082) has the value 8500, but how do I indicate that it's circa (Q5727902) (more or less than 8500)? Eurohunter (talk) 23:50, 4 August 2024 (UTC)[reply]
Sorry, I misunderstood. You can add sourcing circumstances (P1480) with circa (Q5727902). —Justin (koavf)TCM 23:59, 4 August 2024 (UTC)[reply]
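
For anyone adding this through a script rather than the UI, the resulting statement could look roughly like the Python structure below. This is a sketch based on the general Wikibase JSON layout (mainsnak plus qualifiers keyed by property) and should be checked against the data model documentation before use; the function name is made up, and 8500 is the value from the question.

```python
def population_statement_with_circa(amount):
    """Build a population (P1082) statement qualified as circa.

    The qualifier is sourcing circumstances (P1480) = circa (Q5727902).
    Assumes a non-negative integer amount and a unitless quantity
    ("unit": "1").
    """
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": "P1082",
            "datavalue": {
                "type": "quantity",
                "value": {"amount": f"+{amount}", "unit": "1"},
            },
        },
        "qualifiers": {
            "P1480": [
                {
                    "snaktype": "value",
                    "property": "P1480",
                    "datavalue": {
                        "type": "wikibase-entityid",
                        "value": {
                            "entity-type": "item",
                            "numeric-id": 5727902,
                            "id": "Q5727902",
                        },
                    },
                }
            ]
        },
    }


statement = population_statement_with_circa(8500)
print(statement["qualifiers"]["P1480"][0]["datavalue"]["value"]["id"])
```

The qualifier only marks the figure as approximate; it does not express how far off the value may be.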