	Beef up services doc a little
Convert examples to v1beta3, flesh out some details, mention L7.
docs/services.md | 149

--- a/docs/services.md
+++ b/docs/services.md
@@ -40,24 +40,32 @@ port 9376 and carry a label "app=MyApp".
 
 ```json
 {
-  "kind": "Service",
-  "apiVersion": "v1beta1",
-  "id": "myapp",
-  "selector": {
-    "app": "MyApp"
-  },
-  "containerPort": 9376,
-  "protocol": "TCP",
-  "port": 80
+    "kind": "Service",
+    "apiVersion": "v1beta3",
+    "metadata": {
+        "name": "my-service"
+    },
+    "selector": {
+        "app": "MyApp"
+    },
+    "spec": {
+        "ports": [
+            {
+                "protocol": "TCP",
+                "port": 80,
+                "targetPort": 9376
+            }
+        ]
+    }
 }
 ```
 
-This specification will create a new `Service` object named "myapp" which
+This specification will create a new `Service` object named "my-service" which
 targets TCP port 9376 on any `Pod` with the "app=MyApp" label.  Every `Service`
 is also assigned a virtual IP address (called the "portal IP"), which is used by
 the service proxies (see below).  The `Service`'s selector will be evaluated
 continuously and the results will be posted in an `Endpoints` object also named
-"myapp".
+"my-service".
 
 ### Services without selectors
 
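As context for the `Endpoints` object mentioned in the hunk above: the record the system maintains is not shown in this commit, but following the v1beta3 shapes used elsewhere in the file it would look roughly like the sketch below. The `Pod` IPs here are invented for illustration; the port matches the `targetPort` of the example service.

```json
{
    "kind": "Endpoints",
    "apiVersion": "v1beta3",
    "metadata": {
        "name": "my-service"
    },
    "subsets": [
        {
            "addresses": [
                { "IP": "10.244.1.5" },
                { "IP": "10.244.2.7" }
            ],
            "ports": [
                { "port": 9376 }
            ]
        }
    ]
}
```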
@@ -73,23 +81,48 @@ abstract any kind of backend.  For example:
 In any of these scenarios you can define a service without a selector:
 
 ```json
-  "kind": "Service",
-  "apiVersion": "v1beta1",
-  "id": "myapp",
-  "port": 80
+{
+    "kind": "Service",
+    "apiVersion": "v1beta3",
+    "metadata": {
+        "name": "my-service"
+    },
+    "spec": {
+        "ports": [
+            {
+                "protocol": "TCP",
+                "port": 80,
+                "targetPort": 9376
+            }
+        ]
+    }
+}
 ```
 
-Then you can explicitly map the service to a specific endpoint(s):
+Then you can manually map the service to a specific endpoint(s):
 
 ```json
-  "kind": "Endpoints",
-  "apiVersion": "v1beta1",
-  "id": "myapp",
-  "endpoints": ["173.194.112.206:80"]
+{
+    "kind": "Endpoints",
+    "apiVersion": "v1beta3",
+    "metadata": {
+        "name": "my-service"
+    },
+    "subsets": [
+        {
+            "addresses": [
+                { "IP": "1.2.3.4" }
+            ],
+            "ports": [
+                { "port": 80 }
+            ]
+        }
+    ]
+}
 ```
 
-Accessing a `Service` without a selector works the same as if it had selector. The
-traffic will be routed to endpoints defined by the user (`173.194.112.206:80` in
+Accessing a `Service` without a selector works the same as if it had a selector.
+The traffic will be routed to endpoints defined by the user (`1.2.3.4:80` in
 this example).
 
 ## Portals and service proxies
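The `subsets` structure in the hunk above can also carry more than one address; a sketch (again with made-up IPs) of manually mapping the same service to two backends:

```json
{
    "kind": "Endpoints",
    "apiVersion": "v1beta3",
    "metadata": {
        "name": "my-service"
    },
    "subsets": [
        {
            "addresses": [
                { "IP": "1.2.3.4" },
                { "IP": "5.6.7.8" }
            ],
            "ports": [
                { "port": 80 }
            ]
        }
    ]
}
```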
@@ -100,8 +133,9 @@ and `Endpoints` objects. For each `Service` it opens a port (random) on the
 local node.  Any connections made to that port will be proxied to one of the
 corresponding backend `Pods`.  Which backend to use is decided based on the
 AffinityPolicy of the `Service`.  Lastly, it installs iptables rules which
-capture traffic to the `Service`'s `Port` on the `Service`'s portal IP and
-redirects that traffic to the previously described port.
+capture traffic to the `Service`'s `Port` on the `Service`'s portal IP (which
+is entirely virtual) and redirects that traffic to the previously described
+port.
 
 The net result is that any traffic bound for the `Service` is proxied to an
 appropriate backend without the clients knowing anything about Kubernetes or
@@ -112,6 +146,9 @@ appropriate backend without the clients knowing anything about Kubernetes or
 By default, the choice of backend is random.  Client-IP-based session affinity
 can be selected by setting `service.spec.sessionAffinity` to `"ClientIP"`.
 
+As of Kubernetes 1.0, `Service`s are a "layer 3" (TCP/UDP over IP) construct.  We do not
+yet have a concept of "layer 7" (HTTP) services.
+
 ### Why not use round-robin DNS?
 
 A question that pops up every now and then is why we do all this stuff with
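In terms of the example earlier in the file, the affinity setting above would sit in the service `spec`; a minimal sketch, assuming the field placement implied by `service.spec.sessionAffinity` (only the relevant fields shown):

```json
{
    "kind": "Service",
    "apiVersion": "v1beta3",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "sessionAffinity": "ClientIP",
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            }
        ]
    }
}
```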
@@ -174,19 +211,20 @@ virtual portal IP.
 
 ## Headless Services
 
-Users can create headless services by specifying "None" for the PortalIP.
-For such services, an IP or DNS is not created and neither are service-specific
-environment variables for the pods created. Additionally, the kube proxy does not
-handle these services and there is no load balancing or proxying being done by the
-platform for them. The endpoints_controller would still create endpoint records in
-etcd for such services, which are also made available via the API. These services
-also take advantage of any UI, readiness probes, etc. that are applicable for
-services in general.
+Sometimes you don't need or want a single virtual IP.  In this case, you can
+create "headless" services by specifying "None" for the PortalIP.  For such
+services, a virtual IP is not allocated, DNS is not configured (this will be
+fixed), and service-specific environment variables for pods are not created.
+Additionally, the kube proxy does not handle these services and there is no
+load balancing or proxying done by the platform for them. The endpoints
+controller will still create endpoint records in the API for such services.
+These services also take advantage of any UI, readiness probes, etc. that are
+applicable for services in general.
 
-The tradeoff for a developer would be whether to couple to the Kubernetes API or to
-a particular discovery system. This API would not preclude the self-registration
-approach, however, and adapters for other discovery systems could be built upon this
-API, as well.
+The tradeoff for a developer would be whether to couple to the Kubernetes API
+or to a particular discovery system. Applications can still use a
+self-registration pattern and adapters for other discovery systems could be
+built upon this API, as well.
 
 ## External Services
 
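A sketch of such a headless service, assuming the v1beta3 spelling of the field is `portalIP` (the commit text only names "the PortalIP", so the exact field name here is an assumption):

```json
{
    "kind": "Service",
    "apiVersion": "v1beta3",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "portalIP": "None",
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            }
        ]
    }
}
```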
@@ -195,8 +233,8 @@ Service onto an external (outside of your cluster, maybe public internet) IP
 address.
 
 On cloud providers which support external load balancers, this should be as
-simple as setting the `createExternalLoadBalancer` flag of the `Service` to
-`true`.  This sets up a cloud-specific load balancer and populates the
+simple as setting the `createExternalLoadBalancer` flag of the `Service` spec
+to `true`.  This sets up a cloud-specific load balancer and populates the
 `publicIPs` field (see below).  Traffic from the external load balancer will be
 directed at the backend `Pods`, though exactly how that works depends on the
 cloud provider.
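A sketch of the flag in context, assuming it sits directly in the `Service` spec as the new wording says:

```json
{
    "kind": "Service",
    "apiVersion": "v1beta3",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "createExternalLoadBalancer": true,
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            }
        ]
    }
}
```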
@@ -209,19 +247,24 @@ through to the backends.  You are then responsible for ensuring that traffic to
 those IPs gets sent to one or more Kubernetes `Nodes`.  As long as the traffic
 arrives at a Node, it will be subject to the iptables rules.
 
-An example situation might be when a `Node` has both internal and an external
-network interfaces.  If you assign that `Node`'s external IP as a `publicIP`, you
-can then aim traffic at the `Service` port on that `Node` and it will be proxied
-to the backends.
+A common situation is when a `Node` has both internal and external network
+interfaces.  If you put that `Node`'s external IP in `publicIPs`, you can
+then aim traffic at the `Service` port on that `Node` and it will be proxied to
+the backends.  If you set all `Node`s' external IPs as `publicIPs` you can then
+reach a `Service` through any `Node`, which means you can build your own
+load-balancer or even just use DNS round-robin.  The downside to this approach
+is that all such `Service`s share a port space - only one of them can have port
+80, for example.
 
 ## Choosing your own PortalIP address
-A user can specify their own ```PortalIP``` address as part of a service creation
-request.  For example, if they already have an existing DNS entry that they wish
-to replace, or legacy systems that are configured for a specific IP address and difficult
-to re-configure.  The ```PortalIP``` address that a user chooses must be a valid IP address
-and within the portal net CIDR range that is specified by flag to the API server.  If
-the PortalIP value is invalid, the apiserver returns a ```422``` HTTP status code to
-indicate that the value is invalid.
+
+A user can specify their own `PortalIP` address as part of a service creation
+request.  For example, if they already have an existing DNS entry that they
+wish to replace, or legacy systems that are configured for a specific IP
+address and difficult to re-configure.  The `PortalIP` address that a user
+chooses must be a valid IP address and within the portal net CIDR range that is
+specified by flag to the API server.  If the PortalIP value is invalid, the
+apiserver returns a 422 HTTP status code to indicate that the value is invalid.
 
 ## Shortcomings
 
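Putting the two ideas of this hunk together, a sketch of a service that both picks its own portal IP and publishes a node's external IP. All addresses here are invented, the `portalIP` spelling is an assumption, and `publicIPs` is the field named above:

```json
{
    "kind": "Service",
    "apiVersion": "v1beta3",
    "metadata": {
        "name": "my-service"
    },
    "spec": {
        "portalIP": "10.0.171.239",
        "publicIPs": ["203.0.113.10"],
        "ports": [
            {
                "protocol": "TCP",
                "port": 80,
                "targetPort": 9376
            }
        ]
    }
}
```

The chosen `portalIP` would still have to fall inside the portal net CIDR that the apiserver was started with, or the request is rejected with a 422.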
@@ -232,6 +275,7 @@ portals](https://github.com/GoogleCloudPlatform/kubernetes/issues/1107) for more
 details.
 
 Using the kube-proxy obscures the source-IP of a packet accessing a `Service`.
+This makes some kinds of firewalling impossible.
 
 ## Future work
 
@@ -243,11 +287,14 @@ portal will simply transport the packets there.
 There's a
 [proposal](https://github.com/GoogleCloudPlatform/kubernetes/issues/3760) to
 eliminate userspace proxying in favor of doing it all in iptables.  This should
-perform better, though is less flexible than arbitrary userspace code.
+perform better and fix the source-IP obfuscation, though it is less flexible
+than arbitrary userspace code.
 
 We hope to make the situation around external load balancers and public IPs
 simpler and easier to comprehend.
 
+We intend to have first-class support for L7 (HTTP) `Service`s.
+
 ## The gory details of portals
 
 The previous information should be sufficient for many people who just want to
@@ -270,7 +317,7 @@ ensure that no two `Services` can collide.  We do that by allocating each
 
 Unlike `Pod` IP addresses, which actually route to a fixed destination,
 `Service` IPs are not actually answered by a single host.  Instead, we use
-`iptables` (packet processing logic in Linux) to define "virtual" IP addresses
+`iptables` (packet processing logic in Linux) to define virtual IP addresses
 which are transparently redirected as needed.  We call the tuple of the
 `Service` IP and the `Service` port the `portal`.  When clients connect to the
 `portal`, their traffic is automatically transported to an appropriate
 