IRIs, URIs and URLs


Reposted from: http://jero.net/articles/iris-uris-urls

 

You've probably heard about them before (or at least the last two):

  1. Internationalized Resource Identifier (IRI)
  2. Uniform Resource Identifier (URI)
  3. Uniform Resource Locator (URL)

There's also a good chance that you roughly know what the three are about, but most people don't know exactly how they differ. Fear not! This article will give you the answers.

URL is probably the best-known term of the three. You probably already know what a URL is, but what most people don't know is that whenever we talk about URLs, we are also talking about URIs. It doesn't work the other way around, though: every URL is a URI, but not every URI is a URL. So let's see what a URI really is. Before we do that, let's take a look at what Wikipedia has to say about URIs:

A URI can be classified as a locator, a name, or both. A Uniform Resource Locator (URL) is a URI that, in addition to identifying a resource, provides a means of acting upon or obtaining a representation of the resource by describing its primary access mechanism or network "location". For example, the URL http://www.wikipedia.org/ is a URI that identifies a resource (Wikipedia's home page) and implies that a representation of that resource (such as the home page's current HTML code, as encoded characters) is obtainable via HTTP from a network host named www.wikipedia.org.

So let's summarize: a URI can be a URL, but it can also be something other than a way to access a resource such as a web page. When a URI tells you where to find a resource, we call it a URL, or a locator for short. A URI can also be a name rather than a locator; in that case we call it a URN. If you want more information on URNs, Wikipedia is the place to be.

So a URI can be one of two kinds (or both at once):

  1. URL (locator);
  2. and URN (name).

But when you look at the real world, you'll see that when people provide a URI, 99.9% of the time that URI is a URL, not a URN. Even so, the general term URI is recommended over URL, so that's what I'll use from now on.
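To make the locator/name distinction concrete, here is a minimal Python sketch that splits a URI into its components and guesses its kind from the scheme. The `classify` helper and its labels are made up for this illustration, and treating every non-`urn` scheme as a locator is a simplification, not something the URI standard defines.

```python
from urllib.parse import urlsplit


def classify(uri: str) -> str:
    """Guess whether a URI is a name or a locator (illustrative heuristic only)."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "urn":
        # URNs name a resource without saying where to fetch it.
        return "URN (name)"
    if scheme:
        # Assumption: any other scheme (http, ftp, ...) describes an access mechanism.
        return "URL (locator)"
    return "relative reference (no scheme)"


for uri in ("http://www.wikipedia.org/", "urn:isbn:0451450523"):
    parts = urlsplit(uri)
    print(uri, "->", classify(uri), "| host:", repr(parts.netloc), "| path:", repr(parts.path))
```

The Wikipedia address has a host (`www.wikipedia.org`) that tells a client where to go, while the ISBN URN has no host at all: it only names the book.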

Now that we've got that done, let's move on to IRIs. IRIs are new, and revolutionary! If you go back to the list at the top, you will see that the first letter of IRI stands for "Internationalized". What it means is that an IRI may contain Unicode characters, not just the limited set of ASCII characters a URI allows. That is interesting, especially now that the internet and computers have spread around the entire world, across many different languages. And as you know, not every language uses the same alphabet as English. French, for example, uses a lot of characters like ê and é. But those characters cannot appear directly in a URI, the standard we use now; they have to be percent-encoded.

That's why they came up with the IRI. It lets you use non-ASCII characters directly, without percent-encoding (the mechanism that, for example, writes a space as %20). As you can imagine, that adds a lot of value for languages like Japanese: a Japanese website can use Japanese characters in its IRIs, which improves accessibility when the main language of the document is indeed Japanese. That is actually the only difference from the URI standard, but as you can see, it is a very important one. Hopefully we'll be able to use IRIs in the near future, but many current applications cannot handle them yet, so we need to wait until they fully comply with the new standard. In other words: don't count on using IRIs within the next couple of years.
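As a rough illustration of how an IRI relates to the URI an older application would expect, the Python sketch below percent-encodes the non-ASCII characters in the path, query and fragment as UTF-8 and converts the host name to its ASCII (Punycode) form. The function name `iri_to_uri` and the example address are assumptions made for this sketch. It ignores any user name or password in the address, leaves existing `%` signs untouched, and relies on Python's built-in `idna` codec, which implements an older version of the internationalized domain name rules.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left unescaped: URI delimiters plus "%" so existing escapes survive.
SAFE = "/:@!$&'()*+,;=%"


def iri_to_uri(iri: str) -> str:
    """Convert an IRI to an ASCII-only URI (simplified sketch, not a full implementation)."""
    parts = urlsplit(iri)
    host = ""
    if parts.hostname:
        # Host names use Punycode ("xn--...") rather than percent-encoding.
        host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # quote() encodes each non-ASCII character as percent-encoded UTF-8 bytes.
    path = quote(parts.path, safe=SAFE)
    query = quote(parts.query, safe=SAFE + "?")
    fragment = quote(parts.fragment, safe=SAFE + "?")
    return urlunsplit((parts.scheme, host, path, query, fragment))


print(iri_to_uri("http://bücher.example/café?q=ê"))
# prints: http://xn--bcher-kva.example/caf%C3%A9?q=%C3%AA
```

The IRI on the input side is what a person would want to read and type; the URI on the output side carries the same information using only ASCII characters.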

 
