kubeedge-counter-demo (3)

2020-12-08 · Experiments

The [previous bug](https://www.proup.club/index.php/archives/658/) is still not fully resolved. For now, here is a record of yesterday's debugging session.

[TOC]

## Connecting the edge to k8s

**Cloud**:

```
# Expose the k8s control-plane port
ssh -Nfg -L 0.0.0.0:6443:172.18.0.2:6443 192.168.56.103
# Copy the kubeconfig to the edge node
scp -r /root/.kube/config root@192.168.56.101:/root/.kube/config
```

**Edge:**

```
vim /root/.kube/config
```

Under cluster→server, change the address from https://127.0.0.1:xxxx to https://192.168.56.103:6443, then save and quit.

> Run `kubectl version` to check whether it connects:
>
> ```
> root@kubeedge:~# kubectl version
> Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.10", GitCommit:"62876fc6d93e891aa7fbe19771e6a6c03773b0f7", GitTreeState:"clean", BuildDate:"2020-10-15T01:52:24Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
> Unable to connect to the server: x509: certificate is valid for 10.96.0.1, 172.18.0.2, 127.0.0.1, not 192.168.56.103
> ```

### ~~Adding the IP to the certificate~~

Inspect the certificate on the k8s master node:

```
# Enter the k8s master node
docker exec -it kind-control-plane /bin/bash
# Inspect the certificate
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text
```

### Recreating the kind cluster

**Cloud:**

```
kind delete cluster
rm /root/kind.yaml
vim /root/kind.yaml
```

Write the following (192.168.56.103 is the cloud host's IP as seen from the edge):

```
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  apiServerAddress: "192.168.56.103"
  apiServerPort: 6443
nodes:
- role: control-plane
  image: kindest/node:v1.19.3
  extraPortMappings:
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 8080
    hostPort: 8080
    protocol: TCP
```

Since 192.168.56.103:6443 was previously taken by the ssh tunnel, it is best to reboot first so that the port is freed.

```
# Create the k8s master node
kind create cluster --config=/root/kind.yaml
# Copy the new kubeconfig to the edge node again
scp -r /root/.kube/config root@192.168.56.101:/root/.kube/config
```

**Edge:**

```
kubectl version
```

> ```
> root@kubeedge:~/.kube# kubectl version
> Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.10", GitCommit:"62876fc6d93e891aa7fbe19771e6a6c03773b0f7", GitTreeState:"clean", BuildDate:"2020-10-15T01:52:24Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
> Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-11-13T02:48:43Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
> ```
>
> Finally connected.

At this point the edge is connected to k8s.

### Follow-up work

#### Reset edgecore

```
rm /var/lib/kubeedge/edgecore.db
```

#### Install basic tools on the k8s master

```
docker exec -it kind-control-plane /bin/bash
```

##### Check the OS version

```
cat /etc/issue
```

> ```
> root@kind-control-plane:/# cat /etc/issue
> Ubuntu Groovy Gorilla (development branch) \n \l
> ```
>
> Go to [ubuntu | mirror usage help | Tsinghua Open Source Mirror](https://mirror.tuna.tsinghua.edu.cn/help/ubuntu/) and pick version 20.10.

##### Switch the apt sources, install net-tools and vim

```
mv /etc/apt/sources.list /etc/apt/sources.list.bak
echo "deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ groovy main restricted universe multiverse" >/etc/apt/sources.list
echo "deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ groovy-updates main restricted universe multiverse" >>/etc/apt/sources.list
echo "deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ groovy-backports main restricted universe multiverse" >>/etc/apt/sources.list
echo "deb http://mirrors.tuna.tsinghua.edu.cn/ubuntu/ groovy-security main restricted universe multiverse" >>/etc/apt/sources.list
apt-get update
apt-get install -y net-tools vim
```

## Re-running KubeEdge

**Cloud:** register the devices. After switching to the new kind cluster, cloudcore produces a pile of errors unless the device CRDs are re-registered.

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/build/crds/devices
kubectl apply -f ./devices_v1alpha2_device.yaml
kubectl apply -f ./devices_v1alpha2_devicemodel.yaml
cd $GOPATH/src/github.com/kubeedge/kubeedge/build/crds/reliablesyncs
kubectl apply -f objectsync_v1alpha1.yaml
kubectl apply -f cluster_objectsync_v1alpha1.yaml
```

```
cloudcore
```

**Edge:**

```
edgecore
```

## Deploying the counter app

**Cloud:** configure k8s to expose its insecure HTTP port

```
# Enter the kind-control-plane container
docker exec -it kind-control-plane /bin/bash
# Edit the api-server manifest
cd /etc/kubernetes/manifests
vim kube-apiserver.yaml
# 1. Change the `0` in `- --insecure-port=0` to `8080`.
# 2. Add `--insecure-bind-address=0.0.0.0`
# Restart
exit
docker restart kind-control-plane
```

Deploy the counter app:

```
cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/crds
# Register the device
kubectl create -f kubeedge-counter-model.yaml
kubectl create -f kubeedge-counter-instance.yaml
# Delete the old deployments
kubectl delete deployment kubeedge-pi-counter
kubectl delete deployment kubeedge-counter-app
# Deploy the new ones
kubectl create -f kubeedge-pi-counter-app.yaml
kubectl create -f kubeedge-web-controller-app.yaml
```

The current problem: the node IPs of both pods look wrong.

- The web page's pod (kubeedge-counter-app) is on node kind-control-plane/172.18.0.2
- The counter's pod (kubeedge-pi-counter) is on node kubeedge/10.0.2.15

I'm not sure whether it is these unreachable node IPs that keep messages from passing between cloudcore and edgecore. They are not completely blocked, either: the message telling the edge to create the pods did get through, but messages addressed to the pods do not.

> Check the pod's log (first get the full pod name with `kubectl get pods`):
>
> ```
> kubectl logs -f kubeedge-counter-app-576dc9fdc8-ngnsj
> ```
>
> This is the web page's pod.
>
> ```
> root@cloud:.../crds# kubectl logs -f kubeedge-counter-app-576dc9fdc8-ngnsj
> 2020/12/07 02:30:11 Get kubeConfig successfully
> 2020/12/07 02:30:11 Get crdClient successfully
> 2020/12/07 02:30:11.783 [I] http server Running on http://:80
> 2020/12/07 02:30:49 Index Start
> 2020/12/07 02:30:49 Index Finish
> 2020/12/07 02:30:51 ControlTrack: ON
> 2020/12/07 02:30:51 Failed to patch device status &{{[{status {ON map[timestamp:1607 type:string]} {0 map[timestamp:1607 type:string]}}]}} of device counter in namespace default
> error:the server could not find the requested resource (patch devices.devices.kubeedge.io counter)
> ```

The error is: `the server could not find the requested resource (patch devices.devices.kubeedge.io counter)`. Yet a manual query does find the `counter` device:

> ```
> root@cloud:.../crds# kubectl get devices
> NAME      AGE
> counter   177m
> ```

So the problem presumably lies in `devices.devices.kubeedge.io`. Among the four yaml files applied after rebuilding the k8s master node, one of them, "devices_v1alpha2_device.yaml", contains `devices.devices.kubeedge.io`:

> ```yaml
> # $GOPATH/src/github.com/kubeedge/kubeedge/build/crds/devices/devices_v1alpha2_device.yaml
> apiVersion: apiextensions.k8s.io/v1beta1
> kind: CustomResourceDefinition
> metadata:
>   labels:
>     controller-tools.k8s.io: "1.0"
>   name: devices.devices.kubeedge.io
> spec:
>   group: devices.kubeedge.io
>   names:
>     kind: Device
>     plural: devices
>   scope: Namespaced
> ...rest omitted
> ```
>
> So `devices.devices.kubeedge.io` is a CustomResourceDefinition. Let's see what other resources of the same kind exist:

```
kubectl get CustomResourceDefinitions
```

> ```
> root@cloud:.../reliablesyncs# kubectl get CustomResourceDefinitions
> NAME                                           CREATED AT
> clusterobjectsyncs.reliablesyncs.kubeedge.io   2020-12-07T02:00:52Z
> devicemodels.devices.kubeedge.io               2020-12-07T02:00:50Z
> devices.devices.kubeedge.io                    2020-12-07T02:00:50Z
> objectsyncs.reliablesyncs.kubeedge.io          2020-12-07T02:00:50Z
> ```

Look at `devices.devices.kubeedge.io` in detail:

```
kubectl describe CustomResourceDefinitions devices.devices.kubeedge.io
```

> The description is too long, so here are just three excerpts that look potentially useful:
>
> ```
> Name:         devices.devices.kubeedge.io
> Namespace:
> Labels:       controller-tools.k8s.io=1.0
> Annotations:  API Version:  apiextensions.k8s.io/v1
> Kind:         CustomResourceDefinition
> Metadata:
>   Creation Timestamp:  2020-12-07T02:00:50Z
>   Generation:          1
>   Managed Fields:
>     API Version:  apiextensions.k8s.io/v1
>     Fields Type:  FieldsV1
>     fieldsV1:
>       f:status:
>         f:acceptedNames:
>           f:kind:
>           f:listKind:
>           f:plural:
>           f:singular:
>         f:conditions:
>     Manager:      kube-apiserver
>     Operation:    Update
>     Time:         2020-12-07T02:00:50Z
>     API Version:  apiextensions.k8s.io/v1beta1
>
> ...
>
>     Manager:         kubectl
>     Operation:       Update
>     Time:            2020-12-07T02:00:50Z
>   Resource Version:  566
>   Self Link:         /apis/apiextensions.k8s.io/v1/customresourcedefinitions/devices.devices.kubeedge.io
>   UID:               f2006fa2-40a0-4e34-b368-d01932f01207
> Spec:
>   Conversion:
>     Strategy:  None
>   Group:       devices.kubeedge.io
>   Names:
>     Kind:       Device
>     List Kind:  DeviceList
>     Plural:     devices
>     Singular:   device
>   Preserve Unknown Fields:  true
>   Scope:                    Namespace
>
> ...
>
> Status:
>   Accepted Names:
>     Kind:       Device
>     List Kind:  DeviceList
>     Plural:     devices
>     Singular:   device
>   Conditions:
>     Last Transition Time:  2020-12-07T02:00:50Z
>     Message:               spec.versions[0].schema.openAPIV3Schema.type: Required value: must not be empty at the root
>     Reason:                Violations
>     Status:                True
>     Type:                  NonStructuralSchema
>     Last Transition Time:  2020-12-07T02:00:50Z
>     Message:               no conflicts found
>     Reason:                NoConflicts
>     Status:                True
>     Type:                  NamesAccepted
>     Last Transition Time:  2020-12-07T02:00:50Z
>     Message:               the initial names have been accepted
>     Reason:                InitialNamesAccepted
>     Status:                True
>     Type:                  Established
>   Stored Versions:
>     v1alpha2
> Events:
> ```

One line is the self link: `Self Link: /apis/apiextensions.k8s.io/v1/customresourcedefinitions/devices.devices.kubeedge.io`.

In the output of `kubectl describe device counter`, the counter device's self link is `Self Link: /apis/devices.kubeedge.io/v1alpha2/namespaces/default/devices/counter`.

The next step is to find out which URL the failing request — "Failed to patch device status &{{[{status {ON map[timestamp:1607 type:string]} {0 map[timestamp:1607 type:string]}}]}} of device counter in namespace default error:the server could not find the requested resource (patch devices.devices.kubeedge.io counter)" — is actually sent to.

Searching the kubeedge project for "Failed to patch device status" gives one hit, in cloud/pkg/devicecontroller/controller/upstream.go:

> ```
> result := uc.crdClient.Patch(MergePatchType).Namespace(cacheDevice.Namespace).Resource(ResourceTypeDevices).Name(deviceID).Body(body).Do(context.Background())
> if result.Error() != nil {
> 	klog.Errorf("Failed to patch device status %v of device %v in namespace %v", deviceStatus, deviceID, cacheDevice.Namespace)
> 	continue
> }
> ```
>
> `Patch` here is a method of rest.RESTClient, which sends HTTP requests (GET, POST, PUT, DELETE, PATCH). This code sends an HTTP PATCH whose Content-Type header is the variable MergePatchType, i.e. "application/merge-patch+json".
>
> `uc.crdClient.Patch(MergePatchType)` returns a `*rest.Request`, the type defined in vendor/k8s.io/client-go/rest/request.go.
>
> The `uc.crdClient.Patch(MergePatchType).Namespace(cacheDevice.Namespace)` step is not at fault — it is not where `the server could not find` can be produced.

Searching the kubeedge project for "the server could not find" gives one hit, in vendor/k8s.io/apimachinery/pkg/api/errors/errors.go:

> ```
> func NewGenericServerResponse(code int, verb string, qualifiedResource schema.GroupResource, name, serverMessage string, retryAfterSeconds int, isUnexpectedResponse bool) *StatusError {
> 	switch code {
> 	case http.StatusNotFound:
> 		reason = metav1.StatusReasonNotFound
> 		message = "the server could not find the requested resource"
> 	...}}
> ```
>
> Here:
>
> - code is `http.StatusNotFound`
> - verb is `patch`
> - qualifiedResource (a schema.GroupResource) is `devices.devices.kubeedge.io`
> - name is `counter`

**Summary:** the failing line is:

```
uc.crdClient.Patch(MergePatchType).Namespace(cacheDevice.Namespace).Resource(ResourceTypeDevices).Name(deviceID).Body(body).Do(context.Background())
```

`uc.crdClient.Patch(MergePatchType).Namespace(cacheDevice.Namespace)` does not fail; it just sets the request's namespace attribute and still returns a Request.

`xxx.Resource(ResourceTypeDevices)` does not fail either; it sets the request's resource attribute to the value of ResourceTypeDevices and still returns a Request.

`xxx.Name(deviceID)` does not fail either; it sets the request's name attribute to the value of deviceID and still returns a Request.

`xxx.Body(body)` sets the request's body to body. This step should also be fine — the problem is that the device cannot be found.

`xxx.Do(context.Background())` actually sends the request (`context.Background()` is just an empty root context passed along, not something that is "executed"). After Do sends the HTTP request, it converts the HTTP response into a Result via `transformResponse(resp, req)`; on failure it retries, and if the retries fail too, it reports the error — `NewGenericServerResponse`'s "the server could not find the requested resource".

**Next step:** modify cloudcore so that it prints which URL the failing request is sent to.

### Modifying cloudcore

```
# Back up the originals
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/controller
cp upstream.go upstream.go-bak
cp /usr/local/bin/cloudcore /usr/local/bin/cloudcore-bak
```

Start editing:

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/controller
vim upstream.go
```

Find the error site by searching with `/Failed to patch`:

```
# Add the fmt import at the top
# Find this line:
klog.Errorf("Failed to patch device status %v of device %v in namespace %v", deviceStatus, deviceID, cacheDevice.Namespace)
# Add a few lines below it:
req:=uc.crdClient.Patch(MergePatchType).Namespace(cacheDevice.Namespace).Resource(ResourceTypeDevices).Name(deviceID).Body(body)
klog.Warningf("Failed Request is %v",req)
klog.Warningf("Failed Do command is %v",context.Background())
```

#### Build and run

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/
make all WHAT=cloudcore
cp _output/local/bin/cloudcore /usr/local/bin/cloudcore
```

> Wanted to see the error output, but the newly added log lines do not show up.
>
> ```
> kubectl logs -f kubeedge-counter-app-576dc9fdc8-l6zjd
> ```

New finding: the added lines are never executed. More precisely, the function is entered, but it bails out early.

Another problem: the klog.Warningf / klog.Errorf calls print nothing, and I don't know where their output goes. Replace them with fmt.Printf via vim:

```
:%s#klog.*("#fmt.Printf("\\nklog.:#g
:%s#fmt.Println("klog.#fmt.Printf("klog.:#g
```

Now the output is visible.

> ```
> klog.*:Start upstream devicecontroller
> klog.*:Dispatch message: %s 3517d422-c6c4-4e95-b6a5-3c471458a00f
> klog.*:Parse message: %s resource type with error: %s 3517d422-c6c4-4e95-b6a5-3c471458a00f unknown resource
> klog.*:Dispatch message: %s 3517d422-c6c4-4e95-b6a5-3c471458a00f
> klog.*:Parse message: %s resource type with error: %s 3517d422-c6c4-4e95-b6a5-3c471458a00f unknown resource
> ```
>
> After adjusting the output format once more:
>
> ```
> klog.:Start upstream devicecontroller
> klog.:Dispatch message: 5dbdd991-5cb3-46e2-8e86-3f66f5ffe3d8
> klog.:Parse message: 5dbdd991-5cb3-46e2-8e86-3f66f5ffe3d8 resource type with error: unknown resource
> msg= {{5dbdd991-5cb3-46e2-8e86-3f66f5ffe3d8 1607392087250 false} {twin resource get node/kubeedge/membership/detail} {"event_type":"group_membership_event","event_id":"123","group_id":"kubeedge","operation":"detail","timestamp":1607392087250}}
> msg.GetResource()= node/kubeedge/membership/detail
> ```
>
> - sync: false
> - source: twin
> - group: resource
> - operation: get
> - resource: node/kubeedge/membership/detail
> - content: `{"event_type":"group_membership_event","event_id":"123","group_id":"kubeedge","operation":"detail","timestamp":1607392087250}`

In other words, `messagelayer.GetResourceType("node/kubeedge/membership/detail")` failed: it does not recognize the resource "node/kubeedge/membership/detail".

GetResourceType is implemented in cloud/pkg/**devicecontroller**/messagelayer/util.go. **edgecontroller** has its own implementation too.

> ```
> func GetResourceType(resource string) (string, error) {
> 	if strings.Contains(resource, deviceconstants.ResourceTypeTwinEdgeUpdated) {
> 		return deviceconstants.ResourceTypeTwinEdgeUpdated, nil
> 	}
> 	return "", errors.New("unknown resource")
> }
> ```

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/messagelayer
vim util.go
```

> ```
> resource: node/kubeedge/membership
> klog.:Start upstream devicecontroller
> klog.:Dispatch message: 9a837dfd-ae37-4d03-a971-fa1752682209
> ResourceTypeTwinEdgeUpdated is twin/edge_updated
> klog.:Parse message: 9a837dfd-ae37-4d03-a971-fa1752682209 resource type with error: unknown resource
> msg= {{9a837dfd-ae37-4d03-a971-fa1752682209 1607393434185 false} {twin resource get node/kubeedge/membership/detail} {"event_type":"group_membership_event","event_id":"123","group_id":"kubeedge","operation":"detail","timestamp":1607393434185}}
> msg.GetResource()= node/kubeedge/membership/detail
> ```

## Recap

The current problems:

1. The messages twin periodically sends to the resource group carry the resource `node/kubeedge/membership/detail`. Because that resource name does not contain `twin/edge_updated`, it is not recognized and triggers the unknown resource error.
   - These should have gone through EdgeController, but went through DeviceController instead.
2. The code changed so far only shows **upstream messages coming from the edge**. The messages sent from the web control button towards the edge travel as downstream messages.

> Upstream Controller: Sync watch and Update status of resource and events(node, pod and configmap) to K8s-Api-server and also subscribe message from edgecore

So there are two tasks:

1. Modify EdgeController to show whether the messages DeviceController does not recognize are received by EdgeController.
2. Modify the downstream code in both EdgeController and DeviceController, and check whether clicking the button on the web page emits a downstream message, and through which controller it travels.

## ~~Modifying EdgeController~~

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/edgecontroller/controller
vim upstream.go
```

Wait... EdgeController has no "Failed to patch" output at all.

- DeviceController's upstream.go has both patch and dispatch
- EdgeController's upstream.go is all dispatch

EdgeController's upstream.go deals with the status of Node, Pod, Volume and the like — nothing about twin or device.

## Modifying the downstream

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/controller
vim downstream.go
```

downstream.go runs two stages in order: device registration (syncDeviceModel) and synchronization (syncDevice).

Judging from the code, this also looks like the wrong place — the message sent by the web button probably does not pass through here. Still, its output is worth a look.

```
:%s#klog.*("#fmt.Printf("\\nklog.:#g
```

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/
make all WHAT=cloudcore
cp _output/local/bin/cloudcore /usr/local/bin/cloudcore
```

> ```
> klog.:Start downstream devicecontroller
> klog.:Device doesn't support valid protocol
> resource: node/kubeedge/membership
> klog.:Start upstream devicecontroller
> ```
>
> A warning appears: Device doesn't support valid protocol
>
> If device.Spec.Protocol is none of OpcUA, Modbus, Bluetooth or CustomizedProtocol, this warning is emitted, but execution does not bail out; it continues and adds the device instance and the device's protocol to the profile. The device instance prints as:
>
> ```
> Device Instance: &{counter counter counter-model [{status {OFF map[type:string]} {0 map[type:string]}}] [] []}
> ```
>
> Clicking the button on the web page prints nothing at all. syncDevice never receives the "update device status" event triggered from the web page.
>
> ```
> [SyncDevice] device is: &{{Device devices.kubeedge.io/v1alpha2} {counter default /apis/devices.kubeedge.io/v1alpha2/namespaces/default/devices/counter 62498710-d4f4-47fd-8670-e12b6a9885bd 1545 1 2020-12-07 11:44:02 +0000 UTC map[description:counter manufacturer:test] map[] [] [] [{kubectl Update devices.kubeedge.io/v1alpha2 2020-12-07 11:44:02 +0000 UTC FieldsV1 {"f:metadata":{"f:labels":{".":{},"f:description":{},"f:manufacturer":{}}},"f:spec":{".":{},"f:deviceModelRef":{".":{},"f:name":{}},"f:nodeSelector":{".":{},"f:nodeSelectorTerms":{}}},"f:status":{".":{},"f:twins":{}}}}]} {&LocalObjectReference{Name:counter-model,} { } [] {[] } &NodeSelector{NodeSelectorTerms:[]NodeSelectorTerm{NodeSelectorTerm{MatchExpressions:[]NodeSelectorRequirement{NodeSelectorRequirement{Key:,Operator:In,Values:[kubeedge],},},MatchFields:[]NodeSelectorRequirement{},},},}} {[{status {OFF map[type:string]} {0 map[type:string]}}]}}
> ```

## Found the problem.

The problem is in `$GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app/controller/trackController.go`: the code that updates the device status imports a device API that no longer exists:

> ```
> github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/apis/devices/v1alpha1
> ```
>
> Older versions of KubeEdge may have had this directory, but the new version does not. The new path is
>
> ```
> $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/apis/devices/v1alpha2
> ```

Delete the web deployment:

```
kubectl delete deployment kubeedge-counter-app
```

Rebuild the web app:

```
cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app
go build -o kubeedge-counter-controller main.go
```

Now it is clear why importing a "nonexistent" package compiles without error:

> the import "github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/apis/devices/v1alpha1" is resolved from ./vendor, and ./vendor does contain that package.

### ~~Making cloudcore use the v1alpha1 device API~~

The catch is that the v1alpha1 CRD files are somewhat old. First, see whether cloudcore can work with v1alpha1 at all.

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/build/crds/devices
kubectl apply -f ./devices_v1alpha1_device.yaml
kubectl apply -f ./devices_v1alpha1_devicemodel.yaml
```

> Error:
>
> ```
> The CustomResourceDefinition "devicemodels.devices.kubeedge.io" is invalid: spec.version: Invalid value: "v1alpha1": must match the first version in spec.versions
> ```
>
> v1alpha2 has to be uninstalled before v1alpha1 can be used.

```
kubectl delete CustomResourceDefinition devicemodels.devices.kubeedge.io
kubectl delete CustomResourceDefinition devices.devices.kubeedge.io
```
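For context on why the API version matters so much here: client-go builds the request path from the GroupVersion the client was configured with, so a v1alpha1-built client and a cluster that only serves v1alpha2 never meet, and the apiserver answers 404. A minimal sketch (plain Go — `buildPatchPath` is a made-up helper, not KubeEdge or client-go code) of the path that `Patch(...).Namespace(...).Resource(...).Name(...)` ends up requesting:

```go
package main

import "fmt"

// buildPatchPath sketches how client-go's rest.Request assembles the URL for
// a namespaced custom resource: the group/version half comes from the
// client's own config, not from the server.
func buildPatchPath(group, version, ns, resource, name string) string {
	return fmt.Sprintf("/apis/%s/%s/namespaces/%s/%s/%s",
		group, version, ns, resource, name)
}

func main() {
	// What a client vendored against v1alpha1 requests:
	fmt.Println(buildPatchPath("devices.kubeedge.io", "v1alpha1", "default", "devices", "counter"))
	// What the cluster actually serves (cf. the device's Self Link above):
	fmt.Println(buildPatchPath("devices.kubeedge.io", "v1alpha2", "default", "devices", "counter"))
}
```

The second path matches the counter device's self link seen earlier; the first is what a v1alpha1 client asks for, which the server answers with 404 — surfacing as "the server could not find the requested resource".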
With those two deleted, the two v1alpha1 files above can be applied. Maybe this is where the problem was all along — try redeploying.

```
cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/crds
kubectl create -f kubeedge-web-controller-app.yaml
```

cloudcore produces a pile of v1alpha2-related errors. The new KubeEdge cannot use v1alpha1.

Restore the previous state:

```
kubectl delete CustomResourceDefinition devicemodels.devices.kubeedge.io
kubectl delete CustomResourceDefinition devices.devices.kubeedge.io
cd $GOPATH/src/github.com/kubeedge/kubeedge/build/crds/devices
kubectl apply -f ./devices_v1alpha2_device.yaml
kubectl apply -f ./devices_v1alpha2_devicemodel.yaml
cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/crds
kubectl create -f kubeedge-counter-model.yaml
kubectl create -f kubeedge-counter-instance.yaml
```

### Switching the web app to the v1alpha2 API

1. Back up the originals

   ```
   cp -r $GOPATH/src/github.com/kubeedge/kubeedge/vendor $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/
   cp -r $GOPATH/src/github.com/kubeedge/kubeedge $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/vendor/github.com/
   cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app/controller
   cp trackController.go trackController.go-bak
   ```

2. Edit the code

   ```
   cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app/controller
   vim trackController.go
   cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app/utils
   vim crdclient.go
   ```

   - Comment out the xxx.v1alpha1 line in the imports and add a line below it:

     ```
     "github.com/kubeedge/kubeedge/cloud/pkg/apis/devices/v1alpha2"
     ```

   - Replace every v1alpha1 with v1alpha2: `ESC`, `:%s#v1alpha1#v1alpha2#g`

3. Build the Docker image (bump the version number)

   ```
   cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app
   make
   docker build . -t pro1515151515/kubeedge-counter-app:v1.0.9
   #docker login --username=pro1515151515 registry.cn-hangzhou.aliyuncs.com
   docker push pro1515151515/kubeedge-counter-app:v1.0.9
   ```

4. Delete the old deployment

   ```
   kubectl delete deployment kubeedge-counter-app
   ```

5. Update the app's yaml

   ```
   cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/crds
   vim kubeedge-web-controller-app.yaml
   ```

   Bump the image version number.

6. Deploy

   ```
   kubectl create -f kubeedge-web-controller-app.yaml
   ```

7. Check the deployment

   ```
   kubectl get pods
   kubectl describe pod kubeedge-counter-app
   ```

8. Check the pod's log

   ```
   kubectl logs -f kubeedge-counter-app-5847b4c4ff-b5gv5
   ```

   > ```
   > 2020/12/08 12:21:30 Get kubeConfig successfully
   > 2020/12/08 12:21:30 Get crdClient successfully
   > 2020/12/08 12:21:30.391 [I] http server Running on http://:80
   > 2020/12/08 12:22:07 Index Start
   > 2020/12/08 12:22:07 Index Finish
   > 2020/12/08 12:22:30 ControlTrack: ON
   > 2020/12/08 12:22:30 Failed to patch device status &{{[{status {ON map[timestamp:1607 type:string]} {0 map[timestamp:1607 type:string]}}]}} of device counter in namespace default
   > error:the server could not find the requested resource (patch devices.devices.kubeedge.io counter)
   > ```

Exactly the same as before.

Just remembered: connecting the edge to the k8s cluster was for reading the "kubeedge-pi-counter" pod's log:

```
kubectl logs -f kubeedge-pi-counter-85b7c977b-9vb9l
```

> ```
> root@cloud:.../controller# kubectl logs -f kubeedge-pi-counter-85b7c977b-9vb9l
> Error from server: Get "https://192.168.56.101:10350/containerLogs/default/kubeedge-pi-counter-85b7c977b-9vb9l/kubeedge-pi-counter?follow=true": dial tcp 192.168.56.101:10350: connect: connection refused
> ```
>
> Port 10350 on the cloud (192.168.56.103) is unused.
> Port 10350 on the edge (192.168.56.101) is used by edgecore, but bound to localhost only:
>
> ```
> root@kubeedge:~# netstat -anp |grep 10350
> tcp        0      0 127.0.0.1:10350         0.0.0.0:*               LISTEN      2884/edgecore
> ```
>
> The log can be read with curl instead:
>
> ```
> curl http://127.0.0.1:10350/containerLogs/default/kubeedge-pi-counter-85b7c977b-9vb9l/kubeedge-pi-counter?follow=true
> ```
>
> But the log is empty.

Tomorrow: use Resource() in "github.com/kubeedge/kubeedge/cloud/pkg/apis/devices/v1alpha2/register.go" to find out which URL devices.devices.kubeedge.io maps to.

----------

Back up:

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/apis/devices/v1alpha2/
cp register.go register.go-bak
```

Edit register.go:

```
cd $GOPATH/src/github.com/kubeedge/kubeedge/cloud/pkg/apis/devices/v1alpha2/
vim register.go
```

Run:

```
cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app
go run main.go
```

Still no luck. The web app only uses register.go's init(), which doesn't reveal anything; the rest of register.go is presumably used by cloudcore.

Next, keep tracing how the web app's controller looks up devices.devices.kubeedge.io.

~~One factor must not be ignored, though: maybe the counter device on the edge uses the v1alpha1 API, which is why the web app cannot find counter at all.~~

~~Wait... the counter on the edge talks MQTT, not v1alpha1 or v1alpha2.~~

~~Found it: kubeedge-counter-demo/web-controller-app/utils/crdclient.go still uses v1alpha1.~~

```
cd $GOPATH/src/github.com/kubeedge/examples/kubeedge-counter-demo/web-controller-app/controller
vim trackController.go
```

~~Replace the kubeedge-counter-demo/web-controller-app/utils/ import with:~~

```
github.com/kubeedge/kubeedge/cloud/pkg/devicecontroller/utils
```

~~Then some vendor packages need to be moved over.~~

The config used to build the NewRESTClient looks like this:

> ```
> {http://192.168.56.103:8080 { application/vnd.kubernetes.protobuf } { [] map[]} {false [] [] []} 5 10 0s }
> ```
>
> - baseURL: http://192.168.56.103:8080
> - The failing HTTP PATCH request in trackController.go looks like this:
>
> ```
> &{0xc0001cf230 PATCH 0xc0000c8980 { application/json 0x1657840 {{0xc000120c80 [{[application/json] application/json [json] true 0xc000120ec0 0xc000120f00 [] {} 0xc000120ec0} {[application/yaml] application/yaml [yaml] true 0xc000120f40 [] } {[application/vnd.kubernetes.protobuf] application/vnd.kubernetes.protobuf [pb] false 0xc00007d590 [] {} 0xc0001cf0e0}] 0xc0001c15c0 [{application/json true 0xc000120ec0 0xc000120f00 0xc0001cf140} {application/yaml true 0xc000120f40 } {application/vnd.kubernetes.protobuf false 0xc00007d590 0xc0001cf170}] 0xc000120ec0}}} {{{devices.kubeedge.io v1alpha1} 0xc000120ec0 0xc000120c80} {0xc000120ec0} 0xc000120ec0 {} 0xc6f520} /apis/devices.kubeedge.io/v1alpha1 map[] map[Accept:[application/json, */*] Content-Type:[application/merge-patch+json]] default true devices counter 0 0xc000317dd0 0x16962a8 0xc0001c1680}
> ```

Tags: none

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
Update the kubeedge-counter-demo/vendor/github.com/kubeedge version to v1.5.0
Hi, I ran into this problem too — did you ever get it solved?
I couldn't solve it.
The demo uses the v1alpha1 Device API, while KubeEdge v1.5.0 only supports v1alpha2.
In the end I switched to KubeEdge v1.3.1 and worked around the problem that way.