Posts

Showing posts from 2015

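Introduction to the NetApp Storage Performance Monitoring Component for vRealize Operations Manager 6 (Chinese version)

vRealize Operations Manager 6 (also known as vROps) is the new version of vCenter Operations Manager. I have used vCenter Operations Manager since version 1.0, and I really like its self-learning and dynamic threshold features. However, the product can only monitor the virtualization layer; it would be perfect if it could monitor the storage layer as well. In a large vSphere environment, virtual machines share ESXi datastores, so a few virtual machines generating heavy I/O can affect the other virtual machines on the same storage. Imagine you have 100 LUNs on a NetApp filer, with 300 virtual machines using them. One day users say their virtual machines are slow even though they are not running any applications. It is then hard to tell where the problem lies, because virtual machines may share the same datastore, the datastore is built on a LUN, the LUN may come from an aggregate, and multiple LUNs may sit on the same physical disks. vCenter Operations Manager offered a NetApp storage monitoring component in the 5.x era, but it was hard to correlate vSphere datastores with the NetApp storage objects.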

NetApp Management Package for vRealize Operations Manager 6

vRealize Operations Manager 6 (aka vROps) is the new generation of vCenter Operations Manager. I have used vCenter Operations Manager since version 1.0, and I like its self-learning and dynamic threshold features. But the product only monitors the virtualization layer. It would be perfect if it could also monitor the underlying storage. In a large vSphere environment, virtual machines share the I/O capacity of datastores. If a few virtual machines generate high disk I/O, other virtual machines on the same storage may suffer degraded performance. Imagine you have 100 datastores that come from a NetApp filer, with 300 virtual machines running on them. A user says their virtual machine is slow, but there is no workload from the application side. It is hard to say where the latency comes from, because multiple virtual machines may share the same datastore, and multiple LUNs may share the same aggregate and perhaps the same physical disks. vCenter Operations Manager provided a NetApp adapter for 5.x a few years ago. But the problem was it's too har
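To make the correlation problem concrete, here is a minimal Python sketch that groups virtual machines by the aggregate behind their datastore. It is only an illustration: the three mapping dictionaries and every name in them are hypothetical sample data, and the code does not use any vROps or NetApp API.

```python
from collections import defaultdict


def vms_by_aggregate(vm_to_datastore, datastore_to_lun, lun_to_aggregate):
    """Group VM names by the aggregate that backs their datastore."""
    groups = defaultdict(list)
    for vm, datastore in vm_to_datastore.items():
        lun = datastore_to_lun.get(datastore)
        aggregate = lun_to_aggregate.get(lun)
        if aggregate is None:
            continue  # skip VMs whose storage chain is unknown
        groups[aggregate].append(vm)
    return {aggregate: sorted(vms) for aggregate, vms in groups.items()}


# Hypothetical inventory, for illustration only.
VM_TO_DATASTORE = {"vm-a": "ds01", "vm-b": "ds02", "vm-c": "ds03"}
DATASTORE_TO_LUN = {"ds01": "lun1", "ds02": "lun2", "ds03": "lun3"}
LUN_TO_AGGREGATE = {"lun1": "aggr1", "lun2": "aggr1", "lun3": "aggr2"}

if __name__ == "__main__":
    print(vms_by_aggregate(VM_TO_DATASTORE, DATASTORE_TO_LUN, LUN_TO_AGGREGATE))
```

In this sample, vm-a and vm-b live on different datastores but end up on the same aggregate, which is exactly the kind of hidden sharing that makes a slow virtual machine hard to diagnose from the vSphere side alone.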

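Initial Configuration of vRealize Automation 7 (Chinese version)

vRealize Automation 7 (vRA 7) has many enhancements and improvements over vRA 6. Plenty of articles online cover them, as well as the installation procedure. The initial configuration of vRA 7 differs considerably from vRA 6. Below is some of my experience, which can help you build a lab environment quickly.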

Initial Configuration of vRealize Automation 7

vRealize Automation 7 (vRA 7) has a lot of enhancements and changes compared with vRA 6. There are plenty of introductions available on the internet. The initial configuration is different from vRA 6. I'm going to share my experience. You can easily build a lab or POC by following this post.

Inventory Service Cannot Be Started (Chinese version)

One day, vCenter Server suddenly could not search for virtual machines. Searching in the vSphere Client returned: Unable to connect to web services to execute query. Verify that the 'VMware VirtualCenter Management Webservices' service is running on https://vCenter_Server_FQDN:10443 . Within a few hours users began complaining that the vSphere Web Client was failing too, always showing the error: Client is not authenticated to VMware Inventory Service - https://Inventory_Service_FQDN:10443 .

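How to Create a VM in a Specified OU in vRA (Chinese version)

When managing an enterprise Active Directory, it is best to organize servers by some particular attribute. For example, servers can be placed in different OUs by role, department, or function. Below is a sample vRO workflow that creates servers in different OUs according to the user's choice in vRA. The steps are only outlined, not described in detail, so it helps to be familiar with vRO and vRA before reading.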

Create VM on specified OU on vRA

A best practice for managing an enterprise Active Directory is to organize servers by particular properties. For example, servers may be put into different OUs by role, business group, or function. Following is a sample vRO workflow that automates provisioning computers in the proper OUs according to the user's choice in the vRA Service Catalog. I'll give only a brief description of each step in this article, so please make sure you understand both products before reading this post.

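Convert a String to an Object in vCO (Chinese version)

When creating a virtual machine, you may need to place it in a different OU depending on its properties, such as role, group, or user group. In vRealize Automation (vRA) it is easy to create a drop-down list for choosing such properties, but their values are usually passed to vRO as strings, and the Active Directory workflows in vRO do not provide a way to convert a string to an OU object.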

Convert string to OU object in vRO

When you put a virtual machine into a particular OU, you may refer to virtual machine properties such as 'server role', 'server group', or 'user group'. It's easy to set up a drop-down list in a vRealize Automation (vRA) blueprint to let users choose these properties, but hard to create a computer account in the corresponding OU location in vRO. That's because vRA passes most values to vRO as strings, and the Active Directory workflows in vRO do not provide a way to convert a string to an OU.
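As a rough illustration of the string side of this problem, the Python sketch below turns a slash-separated OU path into a distinguished name. It is not the vRO Active Directory plug-in and does not return an OU object; the function name, the path format, and the sample domain are assumptions made for this example, and the resulting string would still have to be resolved by whatever directory lookup the workflow uses.

```python
import re

# Characters that would need escaping in a distinguished name.
# This sketch simply rejects them instead of escaping.
_UNSAFE = re.compile(r'[,+"\\<>;=#]')


def ou_distinguished_name(ou_path, domain):
    """Build an OU distinguished name from a path such as 'Servers/Web'."""
    parts = [p.strip() for p in ou_path.split("/") if p.strip()]
    if not parts:
        raise ValueError("empty OU path")
    for part in parts:
        if _UNSAFE.search(part):
            raise ValueError(f"unsupported character in OU name: {part!r}")
    # The most specific OU comes first in a distinguished name.
    ou_rdns = [f"OU={part}" for part in reversed(parts)]
    dc_rdns = [f"DC={label}" for label in domain.split(".") if label]
    return ",".join(ou_rdns + dc_rdns)


if __name__ == "__main__":
    print(ou_distinguished_name("Servers/Web", "contoso.com"))
```

Rejecting special characters keeps the sketch short; a real workflow would more likely escape them or pick the OU from a fixed lookup table keyed by the drop-down value.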

Inventory Service Cannot be Brought Up

One day, my vCenter Server suddenly lost its search function. When I searched for an object in the vSphere Client, it showed " Unable to connect to web services to execute query. Verify that the 'VMware VirtualCenter Management Webservices' service is running on https://vCenter_Server_FQDN:10443 ". A few hours later people started complaining that they got an error in the vSphere Web Client, which showed " Client is not authenticated to VMware Inventory Service - https://Inventory_Service_FQDN:10443 ".

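Newly Created Super Metric Does Not Appear (Chinese version)

Today I created several super metrics in vRealize Operations Manager 6.0, mainly to calculate the physical link throughput of ESXi hosts. I found that these super metrics showed up on only some of the hosts. I suspect a bug. A quick workaround is to restart the vROps vApp.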

Newly Created Super Metric Doesn't Appear for Objects

Today I created a few super metrics in vRealize Operations Manager 6.0 to calculate the throughput of physical links on ESXi hosts. The super metrics appeared on only some of the selected hosts. I guess it's some kind of minor bug. A reboot of the vROps vApp works around it. Just a heads-up.

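Core Mode of Windows Server 2016 Technical Preview 3 - Remote Management Tips (Chinese version)

Microsoft has just released Technical Preview 3 of Windows Server 2016. The new version has many enhancements, and it looks like Microsoft's software-defined data center is catching up with VMware. A stable virtualization layer is a prerequisite for a software-defined data center, but it is Microsoft's weak spot. You have to keep installing patches and rebooting servers, and some organizations even have regular reboot schedules. Microsoft introduced Core mode in Windows Server 2008 and enhanced it in Windows Server 2012 R2. However, Windows Server 2012 R2 targets the small and medium business market, and I don't think those organizations would use Core mode, because it adds a lot of complexity.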

How to call customized vRealize Orchestrator workflows in vRealize Automation Center

You can do almost anything once vRealize Automation (vRA) and vRealize Orchestrator (vRO) are integrated. I think the integration is the hard part if you are a newbie like me. After reading a lot of articles, I learned how it works. Following is my experience; please leave me a comment if you see anything wrong.

Core mode of Windows Server 2016 TP3 - Remote Management Tip

Microsoft just released Technical Preview 3 of Windows Server 2016, and it's catching up with VMware on SDDC. I can see a lot of enhancements in the new version. A stable hypervisor is a prerequisite of SDDC, but it's a weakness of Microsoft. You have to patch and reboot frequently, and some organizations even have regular reboot schedules. Microsoft introduced Core mode in Windows Server 2008 and enhanced it considerably in Windows Server 2012 R2. But Windows Server 2012 R2 is aimed at SMBs. I don't think SMB organizations really need it if you compare the operational complexity of Core mode with the GUI.

PCPU locked up on Cisco UCS

(Figure: Error message of the PSOD) ESXi 5.5 Update 2 is a fairly stable version, but I got a PSOD on one UCS blade a few days ago. It scared me, because I hit a big bug when I upgraded ESXi from 5.1 to 5.5 Update 1 last year (see ESXi 5.5 and Emulex OneConnect 10Gb NIC for details), which caused dozens of virtual machines to crash over and over again. I bet my career would be over if it happened again. :-)

VMware critical notification alarm

The title looks scary, doesn't it? Actually, I don't want to talk about a problem with any VMware product this time, just a feature.

How to Automate Snapshot on Virtual Machine

I always treat virtual machine snapshots as a big risk. They have caused several outages in our production infrastructure. Please check out Best practices for virtual machine snapshots in the VMware to understand how they impact production.

Hack D-Link Wifi Router

Someone set up an unsecured Wi-Fi network near my apartment. I never connected to it until yesterday, because I worried it might be a honeypot. I had some me time last night, so I set up a virtual machine to connect to it.

How to integrate PowerCLI with PowerShell and PowerShell ISE

I wrote a post about how to integrate PowerCLI with PowerShell manually. I rebuilt my computer a few days ago and needed to integrate PowerCLI again. I used to write scripts in PowerGUI, but something kept causing PowerGUI to lose its menu, which frustrated me for a long time. I could not figure out the root cause. So I wondered: is it possible to use PowerShell ISE instead of PowerGUI?

How to identify datastore alignment

I just found an article that shows how to check whether a Windows virtual machine and an ESXi datastore are aligned with the storage sectors.
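For readers who want the arithmetic behind an alignment check, here is a minimal Python sketch. It is not the method from the article mentioned above; it only assumes that a partition counts as aligned when its starting offset is a whole multiple of the storage block size, and the 4096-byte default and the function name are assumptions for illustration.

```python
def is_aligned(starting_offset_bytes, block_size_bytes=4096):
    """Return True if the offset is a whole multiple of the block size."""
    if block_size_bytes <= 0:
        raise ValueError("block size must be positive")
    if starting_offset_bytes < 0:
        raise ValueError("offset must not be negative")
    return starting_offset_bytes % block_size_bytes == 0


if __name__ == "__main__":
    # 63 sectors of 512 bytes, a starting offset used by older partitioning tools.
    print(is_aligned(63 * 512))
    # A 1 MiB starting offset.
    print(is_aligned(1024 * 1024))
```

The offset itself has to come from the guest or the host, for example from the partition's reported starting offset; how to read it is outside this sketch.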

CustomAction VM_InstallJRE returned actual error code 1624

vCenter Server 5.5 Update 2e contains a fix for the Storage Monitor Service (SMS). It's also a stable version since 5.5 Update 1. I ran into a problem when I upgraded my development vCenter Server last weekend. I'd like to share the solution, since VMware doesn't document the problem in its KB. (Maybe I just didn't find it. :-)) It's kind of tricky to fix.

Transparent Page Sharing (TPS) is disabled by default in latest ESXi 5.5 patch

I just heard that Transparent Page Sharing (TPS) is disabled by default in the latest ESXi 5.5 patch, so I looked into it. You may be concerned about this if your IT budget is tight, since it means you need more memory for heavy virtual machines.

A very interesting Microsoft cluster failure

It's been a long time since my last post. I was off the internet due to a health issue. I have just recovered and am back to normal work. I now have to publish my articles in English first and translate them to Chinese later, since I lost a lot of me time after my baby was born (but it's more fun). Hopefully it doesn't affect Google search. :-) An interesting problem happened on a Microsoft cluster when I came back from the hospital. Our DBA team complained that Microsoft Cluster Service failed intermittently on virtual machines. This kept happening for a week. At the beginning of the troubleshooting, the team noticed that quorum disks failed with the following Windows event: Cluster service failed to update the cluster configuration data on the witness resource. Please ensure that the witness resource is online and accessible. So we focused on disk performance. vobd.log also showed some performance degradation, but the times did not match. Microsoft got involved after that, and they said the cluster failure was actually caused b

How to find corresponding physical disk for Hyper-V CSV volumes

CSV (Cluster Shared Volume) is fundamental to Microsoft Hyper-V. You must have it to leverage the Live Migration and High Availability features. But it's very confusing when you want to reclaim a CSV, since a CSV uses a different name from the physical disk. For example, the CSV name is usually "Cluster Disk x" and the path is usually "C:\ClusterStorage\VolumeX" , but the real disk name is "Disk x" in Disk Manager . You have to be very careful when deleting the disk.

How to configure vCAC 6.2 LAB on VMware Workstation 11 – Part 2

vCenter Server Configuration: We will configure the identity source and permissions on vCenter Server.

How to configure vCAC 6.2 LAB on VMware Workstation 11 – Part 1

In previous articles I shared how to build a vCAC 6.2 lab. We created the domain controller and DNS services on DC01.contoso.com , vCenter Server on VC01.contoso.com , 3 ESXi hosts on ESX01/02/03.contoso.com , the vCAC server on vCAC.contoso.com , the vCAC IaaS server on IaaS.contoso.com and a FreeNAS server on FreeNAS.contoso.com .

How to Build vCAC 6.2 LAB on VMware Workstation 11 – Part 1

VMware recently released vRealize Automation 6.2 (vCAC). I guess there will be a newer version along with vSphere 6.0. As an IT pro you have to keep learning new stuff! I built a lab environment on my laptop for learning. I'm going to share my implementation experience below; it took me a dozen hours plus a lot of documentation reading. Initially I felt it was too complicated to deploy (that looks like a tradition of VMware products). But eventually I realized it's not easy to provide a unified self-service end-user interface in a multi-vendor infrastructure. Even OpenStack is not easy!

ESXi 5.5 and Emulex OneConnect 10Gb NIC

*** English Version *** You are using an HP ProLiant BL460c G7 or Gen8 with ESXi 5.5, an Emulex-chipset NIC, and driver version 10.x.x.x. You may experience the host randomly losing connectivity to vCenter Server, with the host status showing "Not responding". You cannot ping any virtual machine hosted on the blade. A high number of pause frames is observed on the HP Virtual Connect module downlinks after the problem occurs. And you see a similar error in the vmkernel logs:

How to Change SCSI Controller Type on Virtual Machine

Some of my virtual machines used the LSI Logic SCSI controller. It's not recommended for Red Hat 6 virtual machines, so we need to change it to the VMware Paravirtual SCSI controller. Basically, the steps are: power off the virtual machine, change the SCSI controller type, and power it on. Then you lose the operating system. :-)

Incompatible device backing specified for device 'x'

It's easy to find a solution for this particular problem: VMware has a KB article for this error. Somehow it didn't work in my case. I don't know the exact root cause, but I resolved it by using vMotion to move the virtual machine to another ESXi host, so you can give that a try.

Linux virtual machine hang on ESXi 5.5 host

English Version Something is wrong on ESXi 5.5 again! Please don't upgrade VMware Tools to 5.5 if you have Debian or Red Hat Linux virtual machines on your ESXi 5.5 hosts. There is an unresolved bug in the vmmemctl driver (balloon driver) of VMware Tools 5.5 that can cause Linux virtual machines to hang.

The CPU has been disabled by the guest operating system

A few weeks ago, our database virtual machines failed at random. The VM events showed the error message " The CPU has been disabled by the guest operating system. Power off or reset the virtual machine. " I didn't find anything abnormal in the vmkernel, hostd, or VM logs. Finally our Linux team identified it as a Linux kernel bug. Please refer to BUG at block/blk-core.c:NNNN! in blk_requeue_request or blk_finish_request . The bug is only present on Red Hat 6 on ESXi 5.5 with VMware Paravirtual SCSI drivers.