`gomnd` disabled, as it complains about every number used in the code,
and `wsl` became much more thorough.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Reboot test does node-by-node reboots followed by cluster health checks
(same as done by provisioner).
Fixed bug with `Read()` returning `Reader` instead of `ReadCloser`
(minor).
Allowed `bootkube` to be `Skipped` (for rebooted node).
Added support for doing checks via provided client instance.
Implemented generic capabilities to skip tests based on cluster
platform.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
The `client.Creds` struct was not used very often, and it made
`client.NewClient` impossible to use in combination with the
`RemoteRenewingFileCertificateProvider`. This modifies
`client.NewClient` to accept a `tls.Config` instead of `client.Creds`,
allowing for the use of `RemoteRenewingFileCertificateProvider` with
`client.NewClient`.
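A minimal sketch of why a `tls.Config` is the more flexible input: it can carry a `GetClientCertificate` callback, so the certificate can change between handshakes. The `renewingProvider` type and `newTLSConfig` helper below are hypothetical stand-ins, not the actual `RemoteRenewingFileCertificateProvider` or `client.NewClient` signatures.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"sync"
)

// renewingProvider is a hypothetical provider that hands out whichever
// certificate was stored most recently.
type renewingProvider struct {
	mu   sync.RWMutex
	cert *tls.Certificate
}

// Update replaces the certificate used by future handshakes.
func (p *renewingProvider) Update(cert *tls.Certificate) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.cert = cert
}

// GetClientCertificate has the signature expected by
// tls.Config.GetClientCertificate.
func (p *renewingProvider) GetClientCertificate(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
	p.mu.RLock()
	defer p.mu.RUnlock()
	if p.cert == nil {
		return nil, errors.New("no certificate available yet")
	}
	return p.cert, nil
}

// newTLSConfig (hypothetical) builds the config a client constructor
// could accept in place of static credentials.
func newTLSConfig(p *renewingProvider) *tls.Config {
	return &tls.Config{
		MinVersion:           tls.VersionTLS12,
		GetClientCertificate: p.GetClientCertificate,
	}
}

func main() {
	p := &renewingProvider{}
	cfg := newTLSConfig(p)

	_, err := cfg.GetClientCertificate(nil)
	fmt.Println("before update:", err)

	p.Update(&tls.Certificate{}) // placeholder certificate for the sketch
	cert, err := cfg.GetClientCertificate(nil)
	fmt.Println("after update:", cert != nil, err)
}
```

With static credentials the certificate is fixed at construction time; with the callback, each new handshake asks the provider again.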
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This extracts Docker Talos cluster provisioner as common code
which might be shared between `osctl cluster` and integration-test.
There should be almost no functional changes.
As proof of concept, abstract cluster readiness checks were implemented
based on provisioned cluster state. It implements same checks as
`basic-integration.sh` in pure Go via Talos/K8s clients.
`conditions` package was promoted from machined-internal to
`internal/pkg` as it is used to run the checks.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Fixes #1563
This implements dmesg reading via `/dev/kmsg`, with message parsing and
formatting. Kernel log facility and severity are parsed, timestamp is
calculated relative to boot time (it's accurate unless time jumps a
lot during node lifetime).
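As an illustration of the parsing described above, here is a minimal sketch that assumes the kernel's documented `/dev/kmsg` record layout (`priority,sequence,microseconds-since-boot,flags;message`). The `kmsgRecord` type and `parseKmsg` function are hypothetical names, not the ones used in this change.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// kmsgRecord is a hypothetical parsed form of one /dev/kmsg record.
type kmsgRecord struct {
	Facility  int           // priority >> 3
	Severity  int           // priority & 7
	Sequence  int64         // kernel sequence number
	SinceBoot time.Duration // offset from boot, as reported by the kernel
	Message   string
}

// Timestamp converts the boot-relative offset into wall-clock time.
// It drifts if the wall clock jumps after bootTime was computed.
func (r kmsgRecord) Timestamp(bootTime time.Time) time.Time {
	return bootTime.Add(r.SinceBoot)
}

// parseKmsg parses a record of the form "prio,seq,usec,flags;message".
func parseKmsg(line string) (kmsgRecord, error) {
	header, message, ok := strings.Cut(line, ";")
	if !ok {
		return kmsgRecord{}, fmt.Errorf("missing ';' separator in %q", line)
	}
	fields := strings.Split(header, ",")
	if len(fields) < 3 {
		return kmsgRecord{}, fmt.Errorf("expected at least 3 header fields, got %d", len(fields))
	}
	prio, err := strconv.Atoi(fields[0])
	if err != nil {
		return kmsgRecord{}, fmt.Errorf("bad priority: %w", err)
	}
	seq, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		return kmsgRecord{}, fmt.Errorf("bad sequence: %w", err)
	}
	usec, err := strconv.ParseInt(fields[2], 10, 64)
	if err != nil {
		return kmsgRecord{}, fmt.Errorf("bad timestamp: %w", err)
	}
	// Continuation lines (" KEY=value") are ignored in this sketch.
	message, _, _ = strings.Cut(message, "\n")

	return kmsgRecord{
		Facility:  prio >> 3,
		Severity:  prio & 7,
		Sequence:  seq,
		SinceBoot: time.Duration(usec) * time.Microsecond,
		Message:   message,
	}, nil
}

func main() {
	rec, err := parseKmsg("6,339,5140900,-;NET: Registered protocol family 10")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	bootTime := time.Now().Add(-time.Hour) // placeholder boot time for the sketch
	fmt.Println(rec.Facility, rec.Severity, rec.Timestamp(bootTime).Format(time.RFC3339), rec.Message)
}
```

Keeping the boot-relative offset in the record and applying the boot time separately makes the accuracy caveat explicit: the result is only as good as the boot time it is added to.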
New flags to follow dmesg were added; the tail flag allows streaming only
new messages (ignoring old messages). We could try to implement tailing
the last N messages; it is just a bit more work, and I am open to
suggestions (for symmetry with regular logs).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR brings our protobuf files into conformance with the protobuf
style guide, and community conventions. It is purely renames, along with
generated docs.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Fixes #1610
1. In `talosconfig`, deprecate `Target` in favor of `Endpoints`
(client-side LB to come next).
2. In `osctl`, use `--nodes` in place of `--target`.
3. In `osctl` add option `--endpoints` to override `Endpoints` for the
call.
Other changes are just updates to catch up with the changes. Most
probably I missed something... And CAPI provider needs update.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This is to prepare for upcoming switch to reading `/dev/kmsg` which
should allow following logs, doing some kind of tail, etc.
The output is far from being perfect, as `dmesg` data is delivered as
single chunk (not as lines), but once server side updates, client side
should match it.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
There are several changes which clean up and address features of osctl,
mostly for multi-node requests:
* responses are filtered, so that client commands can print partial
failures/success responses;
* `RunE` is used in place of `Run` to propagate correct return sequence
on failures;
* cleaned up setting `targets` metadata on outgoing requests; it is set
by default in `globalCtx` already.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Problem seems to be on multiple levels, and there are a few changes
which got mixed in from another PR (just same file changed).
Core of the issue is that `helpers.Fatalf()` calls `os.Exit()`, which
terminates execution and doesn't let the `defer` and other handlers
run. This uses the Cobra feature of error propagation to pass errors up
the stack back to the root command.
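The difference can be sketched without Cobra itself. In the standard-library-only example below, `runCommand` plays the role of a `RunE` handler and `execute` the role of the root command; both names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// runCommand stands in for a RunE-style handler: it reports failure by
// returning an error instead of terminating the process.
func runCommand(trace *[]string) error {
	*trace = append(*trace, "setup")
	// This deferred cleanup runs because the function returns normally.
	// Calling os.Exit at this point would end the process without it.
	defer func() { *trace = append(*trace, "cleanup") }()

	return errors.New("request failed")
}

// execute stands in for the root command: it is the only place where an
// error is turned into an exit code.
func execute() (code int, trace []string) {
	if err := runCommand(&trace); err != nil {
		trace = append(trace, "error: "+err.Error())
		return 1, trace
	}
	return 0, trace
}

func main() {
	code, trace := execute()
	fmt.Println(trace, "exit code:", code)
	// A real CLI would finish with os.Exit(code) here, after every
	// deferred handler has already run.
}
```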
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Now default is not to follow the logs (which is similar to `kubectl logs`).
Integration test was added for `Logs()` API and `osctl logs` command.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This PR only touches `Version` method, but I will expand it to other
methods in the next PR.
When proxying to many upstreams, errors are wrapped as responses as we
can't return error and response from grpc call. Reflect-based function
was introduced to filter out responses which contain errors as
multierror. Reflection was used, as each response is a different Go
type, and we can't write a generic function for it.
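A rough sketch of such a reflection-based filter follows. It assumes a common layout in which every response has a `Messages` slice whose elements carry `Metadata.Error`; the types and the `filterMessages` name are hypothetical, and plain joined error text is used here in place of a multierror package.

```go
package main

import (
	"errors"
	"fmt"
	"reflect"
	"strings"
)

// Hypothetical response layout shared by all response types.
type Metadata struct {
	Hostname string
	Error    string
}

type VersionMessage struct {
	Metadata *Metadata
	Tag      string
}

type VersionResponse struct {
	Messages []*VersionMessage
}

// filterMessages removes messages whose Metadata.Error is set from
// resp.Messages (in place) and returns their errors combined.
func filterMessages(resp interface{}) error {
	v := reflect.ValueOf(resp)
	if v.Kind() != reflect.Ptr || v.IsNil() || v.Elem().Kind() != reflect.Struct {
		return errors.New("expected a non-nil pointer to a struct")
	}
	messages := v.Elem().FieldByName("Messages")
	if !messages.IsValid() || messages.Kind() != reflect.Slice {
		return errors.New("response has no Messages slice")
	}

	kept := reflect.MakeSlice(messages.Type(), 0, messages.Len())
	var failures []string
	for i := 0; i < messages.Len(); i++ {
		msg := messages.Index(i)
		if text := metadataError(msg); text != "" {
			failures = append(failures, text)
			continue
		}
		kept = reflect.Append(kept, msg)
	}
	messages.Set(kept)

	if len(failures) > 0 {
		return errors.New(strings.Join(failures, "; "))
	}
	return nil
}

// metadataError returns msg.Metadata.Error, or "" if the layout differs.
func metadataError(msg reflect.Value) string {
	for _, name := range []string{"Metadata", "Error"} {
		if msg.Kind() == reflect.Ptr {
			if msg.IsNil() {
				return ""
			}
			msg = msg.Elem()
		}
		if msg.Kind() != reflect.Struct {
			return ""
		}
		if msg = msg.FieldByName(name); !msg.IsValid() {
			return ""
		}
	}
	if msg.Kind() != reflect.String {
		return ""
	}
	return msg.String()
}

func main() {
	resp := &VersionResponse{Messages: []*VersionMessage{
		{Metadata: &Metadata{Hostname: "10.0.0.1"}, Tag: "v0.3.0"},
		{Metadata: &Metadata{Hostname: "10.0.0.2", Error: "connection refused"}},
	}}
	err := filterMessages(resp)
	fmt.Println(len(resp.Messages), err)
}
```

The caller ends up with both a partial response and an error, which matches the "both resp & err not nil" case the client has to handle.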
osctl was updated to support having both resp & err not nil. One failed
response shouldn't result in error.
Re-enabled integration test for multiple targets and version
consistency, need e2e validation.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This change is pretty mechanical, just wrap every API so that remote
peer address is used as default for `resp.Metadata.Hostname`.
This makes `NODE:` non-empty in all the API calls.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This replaces codegen version of apid proxying with
talos-systems/grpc-proxy based version. Proxying is transparent, it
doesn't require exact information about methods and response types. It
requires some common layout response to enhance it properly with node
metadata or errors.
There should be no significant changes to the API with the previous
version, but it's worth mentioning a few changes:
1. grpc.ClientConn is established just once per upstream (either local
service or remote apid instance).
2. When called without `-t` (`targets`), apid proxies immediately down
to local service skipping proxying to itself (as before), which results
in empty node metadata in response (before it had local node IP). Might
revert this later to proxy to itself (?).
3. Streaming APIs are now fully supported with multiple targets, but
message definition doesn't contain `ResponseMetadata`, so streaming APIs
are broken now with targets (needs a fix).
4. Errors are now returned as responses with `Error` field set in
`ResponseMetadata`, this requires client library update and `osctl` to
handle it properly.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This moves the Kubeconfig api endpoint to machined and consolidates the
"read a file" code into machined. This also changes Kubeconfig to
use the CopyOut method which changes Kubeconfig to a streaming grpc call.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
- Added common.proto to host NodeMetadata
- go_package names were fixed up so imports are generated with the proper
package names
- fixed up build work (dockerfile) to prevent copying the previously
generated go proto files. This fixes a bug where we could incorrectly
copy the previously generated protobuf instead of a new one generated
at an incorrect location/name/etc.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
This adds the ability to specify additional <talos> endpoints to connect
to and pull back data from.
Signed-off-by: Brad Beam <brad.beam@talos-systems.com>
This removes the github.com/pkg/errors package in favor of the official
error wrapping in go 1.13.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Memory usage reduced around 8-10x: now it stays stable at 1GB.
I disabled some of the new linters, and one rule which is violated a
lot.
It might make sense to go back and enable `wsl` fixing all the issues
(leaving that for another PR).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This detangles the gRPC client code from the userdata code. The
motivation behind this is to make creating clients simpler and not
dependent on our configuration format.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This makes working with the API much cleaner as a client. Using gob
doesn't give the client a well-known type to work with in the API
definition.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
In order for other projects to make use of our APIs, they must not
reside underneath the internal directory. This moves the protobuf
definitions to a top-level "api" directory and scopes them according to
their domain. This change also removes generated code from the gitignore
file so that users don't have to generate the code themselves.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This implements a 'default deny' policy for service operations via the
API: by default, services do not allow any operations.
A service whitelists itself for stop/start/restart by implementing the
interface and returning a boolean flag, which might depend on userdata.
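The opt-in pattern can be sketched as follows. The interface and type names here (`Service`, `APIRestartAllower`, `plainService`) are hypothetical and only illustrate the idea; they are not the identifiers introduced by this change.

```go
package main

import "fmt"

// Service is a minimal hypothetical service definition.
type Service interface {
	ID() string
}

// APIRestartAllower is a hypothetical opt-in interface: only services
// that implement it (and return true) may be restarted via the API.
type APIRestartAllower interface {
	APIRestartAllowed() bool
}

// ntpd opts in to restarts.
type ntpd struct{}

func (ntpd) ID() string              { return "ntpd" }
func (ntpd) APIRestartAllowed() bool { return true }

// plainService does not implement the opt-in interface at all.
type plainService struct{}

func (plainService) ID() string { return "plain" }

// restartAllowed applies the default-deny rule: a missing interface and a
// false return value are both treated as "not allowed".
func restartAllowed(svc Service) bool {
	allower, ok := svc.(APIRestartAllower)
	return ok && allower.APIRestartAllowed()
}

func main() {
	for _, svc := range []Service{ntpd{}, plainService{}} {
		fmt.Printf("%s restart allowed: %v\n", svc.ID(), restartAllowed(svc))
	}
}
```

Because the check is a type assertion, a newly added service is denied until it explicitly opts in.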
Machined APIs `Stop/Start` were renamed to `ServiceStop`/`ServiceStart`
to avoid confusion with osd API `Restart` which is not related to
services. Old APIs are deprecated and compatibility code forwards old
APIs to the new code.
`ServiceRestart` API was introduced to distinguish restart action from
stop/start (previously restart was implemented as stop+start in the
CLI).
Service udevd-trigger was whitelisted for all operations (allows
stopping hanging run, restarting to trigger once again).
Services proxyd & ntpd were whitelisted for restart and start (start is
whitelisted to help with service stuck in stopped state while restarting).
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
It is now possible to `start`/`stop`/`restart` any service via `osctl`
commands.
There are some changes in `ServiceRunner` to support re-use (re-entering
running state). `Services` singleton now tracks service running state to
avoid calling `Start()` on already running `ServiceRunner` instance.
Method `Start()` was renamed to `LoadAndStart()` to break up service
loading (adding to the list of services) and actual service start.
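A simplified sketch of the load/start split and the running-state tracking, using a hypothetical `services` registry rather than the real `Services` singleton or `ServiceRunner`:

```go
package main

import (
	"fmt"
	"sync"
)

// services is a hypothetical registry that remembers which services are
// loaded and which of them are currently running.
type services struct {
	mu      sync.Mutex
	loaded  map[string]func() // start functions keyed by service ID
	running map[string]bool
}

func newServices() *services {
	return &services{loaded: map[string]func(){}, running: map[string]bool{}}
}

// Load adds a service to the list without starting it.
func (s *services) Load(id string, start func()) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.loaded[id] = start
}

// Start runs a loaded service unless it is already marked as running.
func (s *services) Start(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	start, ok := s.loaded[id]
	if !ok {
		return fmt.Errorf("service %q is not loaded", id)
	}
	if s.running[id] {
		return nil // already running: do not start it a second time
	}
	s.running[id] = true
	start()
	return nil
}

// LoadAndStart combines both steps, as at boot time.
func (s *services) LoadAndStart(id string, start func()) error {
	s.Load(id, start)
	return s.Start(id)
}

// MarkStopped records that the service exited, so it can be re-entered.
func (s *services) MarkStopped(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.running[id] = false
}

func main() {
	starts := 0
	svcs := newServices()
	_ = svcs.LoadAndStart("ntpd", func() { starts++ })
	_ = svcs.Start("ntpd") // ignored: already running
	svcs.MarkStopped("ntpd")
	_ = svcs.Start("ntpd") // runs again after a stop
	fmt.Println("starts:", starts)
}
```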
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
This change allows for more accurate mount reporting as /proc/mounts is
a symlink to /proc/self/mounts and contains mounts that are relative to
the running process. In our case this was osd. This caused inaccurate
reporting of mounts since they were relative to osd when we really
wanted mounts relative to machined.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
This PR moves the reset API to the init API definition.
It leverages the same code we use for upgrades.
Signed-off-by: Andrew Rynhard <andrew@andrewrynhard.com>
Service `osd` doesn't have access to rootfs, as it is running in a
container, so move API to `init` which has unconstrained access to
rootfs. (This is in line with another API, `osctl cp`).
Fixes: #752
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
Actual API is implemented in the `init`, as it has access to root
filesystem. `osd` proxies API back to `init` with some tricks to support
grpc streaming.
Given some absolute path, `init` produces and streams back .tar.gz
archive with filesystem contents.
`osctl cp` works in two modes. First mode streams data to stdout, so
that we can do e.g.: `osctl cp /etc - | tar tz`. Second mode extracts
archive to specified location, dropping ownership info and adjusting
permissions a bit. Timestamps are not preserved.
If a full dump with owner/permissions is required, it's better to stream
data to `tar xz`; for a quick and dirty look into filesystem contents
under an unprivileged user, it's easier to use in-place extraction.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>
I couldn't find any use for the `timeout` flag nor the value passed in
the API, but it blocks the much more useful 'target' flag, which is
present in other commands.
Signed-off-by: Andrey Smirnov <smirnov.andrey@gmail.com>