A Roadmap for Building Continuous Integration in Large Teams
November 29th, 2009
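- State: defective code keeps being produced, and even the software’s most basic functions cannot be guaranteed
  - Establish a jidoka (自働化) mechanism that stops the line automatically when a defect occurs
  - Make the state visible: a line stop raises an alert immediately and is handled as the top priority
  - Fix basic quality problems: eliminate environment instability and intermittent, pseudo-random quality problems
  - Build up the team’s basic capabilities:
    - Commit by token, with a clear owner: whoever breaks the build fixes it
    - Assign a dedicated person to monitor continuous integration status
    - For problems that cannot be fixed quickly, keep a working revert mechanism ready, so that production recovers quickly and the damage stays contained
- State: a usable baseline can be obtained, but commits fail frequently
  - A layered, tiered configuration management and verification system
  - Verify earlier: an admission build before code is committed
  - A stricter token system: apply for the token with a successful admission build report
- State: the mainline is stable and usable, but merging branches into the mainline is difficult
  - A standardized continuous integration environment for branches
  - Routine inspection of branch continuous integration status, to spot problems early and provide support
  - Help branch teams develop their own continuous integration specialists
- State: continuous integration is stable and usable, and needs continuous improvement
  - Continuous integration under configuration management, which solves the problem of replicating improvements across a large team
  - Iteratively improved continuous integration
  - Move away from the “do everything fast” and “get there in one step” mindset, commit to continuous improvement in small, steady steps, and apply the PDCA method to building continuous integration in a large team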
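The stricter token rule in the roadmap (the token is granted only against a successful admission build report, and whoever breaks the build fixes it) could be modeled roughly as follows. This is a minimal sketch under stated assumptions, not a description of any real tool: `BuildReport`, `CommitToken`, and their fields are hypothetical names, and a real setup would hook into the version control system rather than an in-memory object.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BuildReport:
    """Result of a pre-commit (admission) build; fields are illustrative."""
    author: str
    passed: bool


class CommitToken:
    """A single token: only its current holder may commit to the mainline."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def request(self, report: BuildReport) -> bool:
        # Stricter token rule: a passing admission build is required.
        if not report.passed:
            return False
        # One committer at a time, so a broken build has one clear owner.
        if self.holder is not None:
            return False
        self.holder = report.author
        return True

    def release(self, author: str, mainline_green: bool) -> bool:
        # Whoever broke the build keeps the token until it is fixed or reverted.
        if author != self.holder or not mainline_green:
            return False
        self.holder = None
        return True
```

In this model the token cannot be handed on while the mainline is red, which is one way to make "whoever breaks it fixes it" enforceable rather than a matter of convention.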
Automation vs. Jidoka (自働化)
November 29th, 2009
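What is automation? Is it the technology that makes a production line run by itself?
At best, that is conveyor-belt technology.
The book 《图解丰田生产方式》 (an illustrated guide to the Toyota Production System) says that jidoka (自働化) means making the production line stop automatically when a problem occurs.
Only jidoka that stops the line automatically lets you find the root cause on the spot, from the actual item and the actual facts, and eliminate quality problems at the source. Automation that only keeps the line running and never stops it neither helps find root causes nor prevents defective products from being produced in time, so the inspectors at the end of the line remain indispensable.
Building real jidoka starts with changing this understanding. Everyone must recognize that jidoka means stopping the line the moment a defect occurs and treating the stop as the top priority. Only with that awareness can the team move forward.
Beyond awareness, jidoka has three prerequisites:
- Visibility. When the line stops, everyone is alerted in the most direct way possible, and the stop is handled as the top priority.
- A quality foundation. If equipment fails often, or products often show quality problems of unknown cause, frequent automatic line stops can bring the production line down outright.
- A capability foundation. After an automatic line stop, the team must be able to find the problem quickly and restore production quickly.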
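Applied to a build pipeline, the stop-the-line idea might look like the sketch below: stages run in order, and the first failure halts everything after it and raises an alert. The names `run_line`, `Stage`, and `alert` are hypothetical, and the alert callback stands in for whatever visual signal a team actually uses.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that returns True when the stage passes.
Stage = Tuple[str, Callable[[], bool]]


def run_line(stages: List[Stage], alert: Callable[[str], None]) -> List[str]:
    """Run stages in order; stop at the first failure and raise an alert.

    Returns the names of the stages that actually ran.
    """
    ran: List[str] = []
    for name, check in stages:
        ran.append(name)
        if not check():
            # Stop the line: later stages must not build on a known defect.
            alert(f"LINE STOPPED at '{name}': fix before any other work")
            break
    return ran
```

The contrast with plain automation is the `break`: a pipeline that logs the failure and keeps going is the conveyor belt described above.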
Continuous Non-Integration
November 1st, 2009
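Continuous integration is only a source of information. Its checks run after code is committed, not before, so all it does is make the project’s health visible. And that visibility depends on builds succeeding most of the time. The complexity of software means that only “does it meet the quality bar?” can be measured simply, while “by how much does it meet (or miss) the bar?” cannot. So if builds fail frequently, the only information continuous integration can offer is “the project has been unhealthy all along”, which is of very little value, if any.
Keeping continuous integration succeeding most of the time requires three rules:
- Always update before committing code
- Always verify locally before committing code
- Never commit (or update) while the online build is failing
If these three rules are not followed, continuous integration can degrade into continuous non-integration: you know the project has problems, but you don’t know where they are or how to solve them.
To avoid this degradation, once continuous integration is in place, further measures are needed to make sure the three rules are actually followed:
- Tier the code repositories and continuous integration
- Push responsibility down level by level
- Build infrastructure for local pre-commit builds
In short: continuous non-integration reduces the value of continuous integration to roughly zero (or even below zero). To prevent it, first set strict rules, and second provide the technical means that make the rules enforceable.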
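As one illustration of backing the three rules with tooling, a commit gate could check them mechanically before accepting a change. This is a sketch only: `CommitAttempt`, `violations`, and the revision numbers are hypothetical, and a real gate would read this state from the version control system and the build server.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommitAttempt:
    """State of a working copy at commit time; all fields are illustrative."""
    base_revision: int        # revision the working copy was last updated to
    local_build_passed: bool  # result of the local pre-commit build


def violations(attempt: CommitAttempt, head_revision: int,
               mainline_green: bool) -> List[str]:
    """Return the rules this commit would break; an empty list means allowed."""
    broken: List[str] = []
    if attempt.base_revision != head_revision:
        broken.append("update before committing")
    if not attempt.local_build_passed:
        broken.append("verify locally before committing")
    if not mainline_green:
        broken.append("no commits while the online build is failing")
    return broken
```

Returning the list of broken rules, rather than a bare yes or no, tells the committer what to do next, which matters if the rules are to be followed rather than worked around.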
Cruise Is All Green
June 30th, 2009
Recommendations From The Build Master
December 11th, 2008
These recommendations on multi-team configuration management (or continuous integration) come from The Build Master.
- Create the mainline (public) and virtual build labs (private) codelines.
- Make sure the mainline is pristine and always buildable and consumable. Create shippable bits on a daily basis. Use consistent, reliable builds.
- Build private branches in parallel with the main build at a frequency set by the CBT.
- Use consistent reverse and forward integration criteria across teams.
- Be aware that dev check-ins are normally made only into a private branch or tree, not the mainline.
- Know that check-ins into a private branch are only reverse integrated (RId) into main when stringent, division-wide criteria are met.
- Use atomic check-ins (RI) from private into main. Atomic means all or nothing. You can back out changes if needed.
- Make project teams accountable for their check-ins, and empower them to control their build process with help from the CBT.
- Configure the public/private source so that multisite or parallel development works.
- Optimize the source tree or branch structure so that you have ONLY ONE branch per component of your product.
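The "all or nothing" property of an atomic reverse integration can be illustrated with a toy model in which the mainline is a dictionary of file contents. This is a sketch under heavy assumptions, not how any real source control system works: `reverse_integrate` and `criteria` are hypothetical names, and `criteria` stands in for the division-wide acceptance checks.

```python
from typing import Callable, Dict


def reverse_integrate(mainline: Dict[str, str], changes: Dict[str, str],
                      criteria: Callable[[Dict[str, str]], bool]) -> bool:
    """Apply all changes to the mainline, or none of them."""
    candidate = dict(mainline)   # work on a copy, never on the mainline itself
    candidate.update(changes)
    if not criteria(candidate):
        return False             # mainline untouched: nothing to back out
    mainline.clear()
    mainline.update(candidate)
    return True
```

Because the criteria are evaluated against the combined result before the mainline changes, a rejected integration leaves the mainline exactly as it was.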
Start with Continuous Integration: What Are You Afraid Of?
September 19th, 2008
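鼠标 says that many people believe agile starts with continuous integration, and so some people got scared.
They say: rushing headlong into continuous integration like this won’t work.
1. Understand what real agile is
Through study and research, understand what real agile is. If some of the basic initial concepts are wrong, a miss by an inch at the start becomes a miss by a mile in the rollout that follows, and a real agile implementation is out of the question.
2. Estimate the benefits of adopting agile
Once you are sure you know what real agile is, assess the current state, problems, and weak points of your team and organization, set improvement goals, and estimate whether agile can solve these problems and bring potential improvements. If the chance of success is below 50%, or the current state is already good enough, don’t adopt agile; why waste people’s effort and money?
Interesting. If you can’t work out what “real agile” is, or “the current state is already good enough”, then there is no need to improve? No need to solve problems?
What is there to be afraid of? Perhaps that continuous integration is a real, concrete thing. Perhaps that once continuous integration is in place, every problem is exposed in the stark form of a test being red or green, and has to be solved. Perhaps that continuous integration has no one-size-fits-all platitudes to recite, and the build script has to be written piece by piece. Perhaps that nobody will play the “study and research” game with you any more. Perhaps that “real agile implementation” will stop selling.
As someone whose job is helping people solve problems, I say more and more often that I don’t care whether something is agile or not. Tell me what problems you have, and let’s look at how to solve them together. You may finish a project still not doing agile; that’s fine, as long as I solve your problems and improve your efficiency.
ThoughtWorks is a company that does real work, so we believe the first thing every project should do is let the real software speak up and say what is wrong with it, and then we solve the problems. No wonder some people are afraid. If everyone is busy solving problems, what becomes of studying and researching the basic initial concepts?
Looks like somebody’s cheese got moved again.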
You Break It, You Fix It
August 2nd, 2008
(From The Build Master)
This leads to one of the most important rules in the build lab: The build team never fixes build breaks, regardless of how trivial the break is. That’s the developer’s responsibility. We took this a step further: The developer who breaks the build has to go to his development machine, check out the file, fix it, and then go through all the check-in process steps again.
Build Breaks Always Have the Highest Priority for Everyone
This rule means that if you are a developer and you can fix the build break, and the developer who broke the build cannot be found, you should fix it immediately. Afterward, send an e-mail to the developer and the build team explaining what you did to fix the build, and remind your co-worker that he owes you a favor.
Microsoft's Build Lab
July 28th, 2008
(From The Build Master)
Because a build lab tends to have some downtime while the build team waits for compiles, links, and tests to finish, it should take advantage of these slow times to work on improvements to the build process. After the lab tests the improvements and confirms they are ready for primetime, it rolls out the changes. One way to deploy a new build process after a shipping cycle is to send a memo to the whole team pointing to an internal Web site that has directions on the new process that the Central Build Team will be using in future product builds.
Today, the Windows build lab has its own development team working on writing and maintaining new and old project tools. The development team also works on deploying new build processes. Conversely, of the more than 200 customers I’ve spoken to, only one or two of them have developers working in a build team.
In 1991, Windows NT had only a few hundred thousand lines of code, unlike the more than 40 million lines of code that Windows XP has today. Even in the early stages of developing Windows NT, Microsoft recognized the importance of a good build process.





