Automated builds of PHP projects using Phing. How to bring it all together

Jenkins? Builds? PHP? you ask, wide-eyed O_o. What does a project build, which comes naturally to most compiled languages like C or Java, have to do with PHP projects? They are neither compiled nor “built”.

When I was looking for material on continuous integration, I could not find a single simple article that explained in plain Russian what all this is for (and why) when it comes to PHP. So in this post I will explain, first, what is what, and second, how to use it day to day.

Who is this post for:

Developers, team leads and project managers who have come around to the idea that delivering new versions of a software product (even one in PHP) should be stable and regular, like the assembly line of a Japanese factory. Ideally, the words “automated testing” and “deployment” already mean something to you. Or at least you are curious how to (damn it) stop pushing out “important” updates.

The post may also be useful for teams led by a typical marketer or other non-technical person, where the team has to keep code quality under control on its own.

Because I want to write an overview rather than step-by-step instructions for dummies, here are a few basics first, using Ubuntu Server as the example:

sudo apt-cache search <name> – search for a package by name;
sudo apt-get install <name> – install the named package;
pear list -a – list installed PEAR packages;
pear config-set auto_discover 1 – auto-discover PEAR channels (convenient, so you do not have to add the channels by hand);
pear install --alldeps <name> – install the named PEAR package with all its dependencies.

Theory:

In the general case, PHP projects require no build at all. So forget the word “build” and think instead of a set of important procedures that should run regularly, on a schedule or before the next release of the project. The release could be a new product version, a site update, or some other event.

The point is that day after day you commit, commit, commit changes, and every so often you want to check whether anything is broken. Are all business functions working? Did the new hire mess something up? Has that lazy guy in the corner finally learned to use camelCase in names and to follow the team's conventions for indentation and brackets? Is the amount of “dead” code that is no longer used anywhere growing? What about mindless copy-paste, redundant classes, unused variables?

And like any normal lazy person, you want all of this done automatically. If something is wrong, both you and the specific culprit should receive a report. So, let's assume you have a Linux server with a web environment on it, where you already know how to deploy a project from svn/git/etc.

Tools:

Jenkins is a continuous integration “server” written in hellish Java, with a terrible GUI, and hellishly slow. It has almost no purpose of its own beyond being a ready-made base for the tasks listed above. It can do almost nothing by itself; it only manages third-party processes and plugins. As will become clear later, that is exactly what we need.

Ant is a modest program for running a set of shell commands on Linux. It differs from a plain sh script in several useful ways, such as a convenient syntax for manipulating files and directories, ordering and dependencies between tasks, and other small things. It is configured with a simple XML file, nothing complicated. Here it is used to run the tools below in sequence. It can also clean up temporary files, build documentation (PHPDoc), compress CSS/JS files, and perform any other task.

PHPUnit is an application (essentially a set of PHP scripts) for running automated tests against your product. I say “application” because it is convenient to launch from the command line and it has many options. Besides running the tests themselves, it can produce visual test coverage reports for your project.

PHP CodeSniffer is a PEAR package that checks compliance with coding standards (indentation, line length, variable names, etc.). You can use ready-made standards or write your own. If your team rages daily in holy wars like “4 spaces versus tabs” or “should the concatenation dot be surrounded by spaces”, this is the tool for you.

PHP Mess Detector is one of the most useful PEAR packages in this collection. It checks the quality of the code's design rather than its writing style. For example, it will complain if a class has too many public methods. In general, it tells you where refactoring awaits you.

PHP Copy/Paste Detector – as the name implies, a PEAR package for tracking copy-paste. Don’t mindlessly copy-paste, remember?

PHP Dead Code Detector is another PEAR package with a self-explanatory name. It finds unused sections of code. By the way, this is the only package that I could not hook up to Jenkins (if you know how, tell me in the comments).

PHP Depend is the most hellishly incomprehensible PEAR package of the lot. It builds complex charts and “pyramids” of dependencies, inheritance, and much more. It is worth using once you have figured out everything else.

You also need to install a set of Jenkins plugins for PHP, following the example on the jenkins-php.org website (installing them by hand through the admin panel takes an hour, so use the ready-made commands from the “using Jenkins CLI” paragraph).

How to put it all together:

So, you have taken a separate Linux server and installed all of this tooling on it. First, decide on a working directory. It is convenient to use the directory one level above the one where your project is checked out from the version control system.

Now configure Ant, that is, write its configuration file build.xml in the working directory you chose in the previous paragraph. Ant has excellent documentation; search for it and grab some examples. Personally, I use Ant to run the tests and all of the analyzers above. Their output is collected in a dedicated subdirectory.
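As a rough illustration of such a build.xml, here is a minimal sketch. The directory layout (src/, tests/, build/logs/), the project name, the chosen coding standard and the PHPMD rulesets are assumptions made for this example, not the author's actual configuration; adjust them to your project. Coverage output also assumes a coverage driver such as Xdebug is installed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed layout: sources in src/, tests in tests/, reports in build/logs/ -->
<project name="my-project" default="build" basedir=".">

  <target name="clean">
    <delete dir="${basedir}/build/logs"/>
  </target>

  <target name="prepare" depends="clean">
    <mkdir dir="${basedir}/build/logs"/>
  </target>

  <!-- Unit tests: JUnit-style log and Clover coverage for the Jenkins plugins -->
  <target name="phpunit" depends="prepare">
    <exec executable="phpunit" failonerror="true">
      <arg value="--log-junit"/>
      <arg value="${basedir}/build/logs/junit.xml"/>
      <arg value="--coverage-clover"/>
      <arg value="${basedir}/build/logs/clover.xml"/>
      <arg path="${basedir}/tests"/>
    </exec>
  </target>

  <!-- Coding standard report in Checkstyle format -->
  <target name="phpcs" depends="prepare">
    <exec executable="phpcs">
      <arg value="--report=checkstyle"/>
      <arg value="--report-file=${basedir}/build/logs/checkstyle.xml"/>
      <arg value="--standard=PEAR"/>
      <arg path="${basedir}/src"/>
    </exec>
  </target>

  <!-- Mess detector report in PMD format -->
  <target name="phpmd" depends="prepare">
    <exec executable="phpmd">
      <arg path="${basedir}/src"/>
      <arg value="xml"/>
      <arg value="codesize,unusedcode,naming"/>
      <arg value="--reportfile"/>
      <arg value="${basedir}/build/logs/pmd.xml"/>
    </exec>
  </target>

  <target name="build" depends="phpunit,phpcs,phpmd"/>
</project>
```

Only phpunit is marked failonerror here, so style and mess-detector findings are reported without failing the whole run; whether that is the right policy is a team decision.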

Then create a project (job) in Jenkins. There is ready-made documentation for this too, so I will only mention a few important points:

– check that the correct base directory is used in the project settings, because the paths to the result files will be specified relative to it;
– configure an update from the version control system;
– in the “Post-build Actions” section, connect the plugins: PMD, Checkstyle, Duplicate code, Coverage (Clover), JDepend, and others to your taste.

What should be the result:

Clicking “Build Now” should start the build. Find the item named something like “View console” (the translation is clumsy, yes), and there you will see the full log of what is happening. For example, the steps could be:

– update from the version control system;
– run Ant, which in turn runs phpunit, phpmd, phpcs, etc.;
– parse the reports you specified in the post-build actions.

As a result, you get a pile of statistics about the state of your project at the time of this “build”: not only the test results, but also the output of the analyzers above. Jenkins displays it all in a reasonably readable form right in the browser.
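To give a feel for what these reports contain, here is a small sketch that tallies violations by severity from a report in the Checkstyle XML layout, the format that phpcs writes with --report=checkstyle. The sample report, file names, messages and the function name countBySeverity are made up for illustration; Jenkins plugins do their own, far more thorough parsing.

```php
<?php
// A tiny hand-written sample in the Checkstyle XML layout (hypothetical data).
$report = <<<XML
<checkstyle version="1.0">
  <file name="src/User.php">
    <error line="12" column="5" severity="error" message="Line indented incorrectly" source="Generic.WhiteSpace"/>
    <error line="40" column="1" severity="warning" message="Line exceeds 85 characters" source="Generic.Files.LineLength"/>
  </file>
  <file name="src/Order.php">
    <error line="7" column="9" severity="error" message="Missing function doc comment" source="PEAR.Commenting"/>
  </file>
</checkstyle>
XML;

/**
 * Count reported violations per severity level.
 * (Hypothetical helper, not part of any of the tools above.)
 */
function countBySeverity(string $xml): array
{
    $counts = [];
    $doc = new SimpleXMLElement($xml);
    foreach ($doc->file as $file) {
        foreach ($file->error as $error) {
            $severity = (string) $error['severity'];
            $counts[$severity] = ($counts[$severity] ?? 0) + 1;
        }
    }
    return $counts;
}

$counts = countBySeverity($report);
print_r($counts);
```

For the sample above this reports two errors and one warning.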

This is what the project summary page looks like:

To be continued (with pictures and tips).

I have a PHP script that works through a large array of people: it grabs their details from an external resource via SOAP, modifies the data and sends it back. Because of the size of the details, I raised PHP's memory limit to 128 MB. After about 4 hours of running (the full run will probably take 4 days), it ran out of memory. Here are the basics of what it does:

$people = getPeople();
foreach ($people as $person) {
    $data = get_personal_data();
    if ($data == "blah") {
        importToPerson("blah", $person);
    } else {
        importToPerson("else", $person);
    }
}

After it ran out of memory and crashed, I decided to initialize $data before the foreach loop. According to top, the memory usage of the process has not gone above 7.8% since, and it has been running for 12 hours.

So my question is: does PHP's garbage collection not run on variables initialized inside a loop, even if they are reused? Or is it up to the system to reclaim the memory, and PHP has simply not marked it as reusable yet, so the script will eventually crash again? (I also bumped the limit up to 256 MB, so I changed two things and am not sure which one fixed it. I could probably change the script back to answer this question, but I don't want to wait another 12 hours for it to crash to find out.)

I don't use the Zend framework, so I don't think the other question similar to this one applies.

EDIT: I don't have a problem with the script or what it does. At the moment, as far as the whole system is concerned, I have no problems. This question is about the garbage collector and how/when it reclaims resources in a foreach loop, and/or how the system reports memory usage for a PHP process.

2 answers

I don't know the internals of the PHP VM, but in my experience it does not garbage collect while your page is running. That is because it throws away everything belonging to the page once the page finishes.

In most cases, when a page runs out of memory and the limit is fairly high (and 128 MB is not that high), the problem is in the algorithm. Many PHP programmers build up a data structure and then pass it to the next step, which iterates over the structure, usually creating another one. Lather, rinse, repeat. Unfortunately, this approach is a big memory hog, because you end up with multiple copies of your data in memory. Two of the really big changes in PHP 5 were that objects are reference counted rather than copied, and that the entire string subsystem became much faster. But it is still a problem.

To minimize memory usage, look at restructuring your algorithm so that it works on one piece of data from start to finish, then fetches the next one and starts again. In the best case, you never have the entire data set in memory. For a database-backed website, this would mean processing one row from a database query all the way through to presentation before fetching the next. Of course, this approach is not always possible, and sometimes the script simply has to hold a large amount of data in memory.

However, you can apply this memory-saving approach to part of the data. The trick is to explicitly unset() a key variable or two at the end of the loop. This should give the space back. Another "best practice" trick is to move data processing that does not belong in the loop out of it, as you seem to have discovered.
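A minimal sketch of the unset() advice, under stated assumptions: the three functions are hypothetical stand-ins for the ones in the question (the real ones talk to a SOAP service, and get_personal_data() is given a parameter here only to make the stub self-contained). The memory_get_usage() calls are there so you can check the effect on your own setup rather than take it on faith.

```php
<?php
// Hypothetical stand-ins for the question's functions.
function getPeople(): array
{
    return range(1, 1000);
}

function get_personal_data(int $person): string
{
    // Simulate a large payload for each person (about 100 KB).
    return str_repeat($person % 2 === 0 ? 'a' : 'b', 100000);
}

function importToPerson(string $kind, int $person): void
{
    // A real implementation would send the data back to the remote service.
}

$before = memory_get_usage();

foreach (getPeople() as $person) {
    $data = get_personal_data($person);
    importToPerson($data[0] === 'a' ? 'blah' : 'else', $person);
    // Release the payload explicitly before the next iteration.
    unset($data);
}

$growth = memory_get_usage() - $before;
echo "Memory growth after the loop: {$growth} bytes\n";
```

Each person is fetched, processed and released before the next one is touched, so only one payload is held at a time. If the reported growth keeps climbing as the number of people increases, something else in the loop is holding on to the data.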

I run PHP scripts that need 1 GB of memory. In fact, you can set the memory limit from within a script using ini_set("memory_limit", "1G");

During data collection, our collectors will run around the clock, and we need to monitor how they are doing. For example, the server may crash, the collection conditions may change, or the power may simply go out. To avoid watching the work constantly, we need a mechanism that sends notifications about the collectors' progress.…

In previous articles, I described how to create multi-threaded search suggestion collectors in PHP for the following systems: Google Suggest, a collector of Google search suggestions; Yandex Suggest, a collector of Yandex search suggestions; Rambler Suggest, a collector of Rambler search suggestions; Nigma Suggest, a collector of Nigma search suggestions...

Let's move on to other places where you can get search suggestions. The next search engine is Rambler. You can get quite decent search queries from it as well. Looking around on the Internet, we find that the following address is used for the requests. All this fits well with our...

Another place where you can get search queries is the suggestions of the Nigma.ru search engine. Typing in, for example, “search terms”, we get the following: Having searched the Internet, we find that the request for search suggestions goes to the following address. That is, at the very end...

When we type words into Google, it gives us search suggestions. For example, when we enter business automation, Google offers the following search queries: Searching the Internet, you can find the URL from which this data is requested: that is, in the very...

In previous posts, I created a multi-threaded collector of Google search results via the API in PHP. Experience showed that its speed depended heavily on the quality of the proxies used. Today I reworked the way this collector uses proxies. For this it was completely...

DNS Lookup support library.

  • Number4: additions for the math functions.
  • Win32Build: ready-made libraries required for the build.
  • mssql: libraries for programming against MS SQL 6.5.
  • CVS client: needed to download the PHP sources from the CVS repository.
  • You will also need files from MSVC++ 6.0 (MSVC++ 5.0 users only):

    • OLE: put into VC\include. Required for COM support.
    • HTTP: rename the existing ones to *.hold, and put the new ones into VC\include. Required to build the ISAPI filter.

    INSTALLATION

    Install the Cygwin package, for example into the directory C:\Program Files\Cygnus. On NT you need to create an environment variable CYGWIN with the value %SystemDrive%\Program Files\cygnus\cygwin-b20

    This is done as follows: go to Start->Settings->Control Panel, open System, select the Environment tab, and click in the System variables list. At the bottom there are two fields, Variable and Value. Enter CYGWIN in the Variable field and %SystemDrive%\Program Files\cygnus\cygwin-b20 in the Value field, then click Set and Apply.

    Create a Tmp directory in the root of the system drive and, in the same way, add a TMP variable with the value %SystemDrive%\Tmp. Append %SystemDrive%\Tmp to the Path variable.

    Click Apply and OK. After this you need to reboot.

    Unpack the win32build.zip archive into a directory, for example C:\Win32build

    Launch the MSVC++ 5.0 environment, go to Tools->Options, open the Directories tab and add to the sections

    • Include
    • Libraries

    the following paths respectively:

    • C:\Program Files\cygnus\cygwin-b20\H-i586-cygwin32\bin
    • C:\Win32Build\include
    • C:\Win32Build\lib

    Unpack Bindlib_w32.zip into a directory, for example C:\Bindlib, find the project file bindlib.dsp in it and build it in the MSVC environment. resolve.lib will appear in the C:\Bindlib\Debug directory; copy it into the C:\Win32Build\Lib directory, overwriting the old one.

    ASSEMBLY

    Go to the C:\Php4\Tsrm directory and build TSRM.dsp. After the build, Tsrm.lib will appear in the C:\Php4\Tsrm\Debug directory; copy it to the C:\Win32Build\Lib directory.

    Go to the C:\Php4\Zend directory, open the ZendTS.dsp project and build it. After the build, the ZendTS.lib library will appear in the C:\Php4\Zend\Debug directory; copy it to the C:\Win32Build\Lib directory.

    Go to the C:\Php4 directory and open the php4ts.dsp project. It contains 4 projects; build php4ts first (do not forget to set it as the active project: Project->Set Active Project->php4ts).

    After the build, the files php4ts.lib, php4ts.dll and php.exe will appear in the C:\Php4\Debug directory. Again, copy php4ts.lib to C:\Win32Build\Lib. This library is needed to build external PHP modules, for example the mssql support module.

    Make the php4isapi project active and build it. The directory C:\php4\sapi\isapi\debug will then contain php4isapi.dll, a filter for IIS.

    Known bugs that seriously get in the way, and how to work around them

    1. Data larger than 4K is not posted through form elements, and files larger than 4K are not uploaded. The php4.exe process hangs and can only be cleared by restarting IIS. (I filed a bug report, but it is unclear when it will be fixed; I will try to fix it myself.) Workaround: use PHP 3.12-3.14 for the files that process forms. Slower, but it works.

    2. When dynamic code that uses serialized variables is executed, the parser reports an error. For example:

    Result: parser error on line 5…..

    This happens because in version 4 the {} symbols are used to encapsulate variables in a string, for example $a="This is the element {$NotSer}";

    But the serialized representation of an array also contains {} characters.

    In the example described above, the parser tries to encapsulate an expression that is not in fact an expression, and naturally fails with an error.

    Solution: patch the file \php4\ext\standard\var.c, replacing { with [ and } with ]. Find the substring %d:{ in the file and replace it with %d:[ (there will be two replacements). In the line if (**p != ':' || *((*p) + 1) != '{') replace { with [. In the lines for ((*p) += 2; **p && **p != '}' && i > 0; i--) and return *((*p)++) == '}'; replace } with ]. Then rebuild php4ts.dsp. After that everything works fine.
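For reference, a quick check on a current PHP version shows which delimiter characters appear in the serialized form of an array. The sample array is an assumption chosen for illustration, and this runs against stock PHP, not the patched build described above.

```php
<?php
// Hypothetical sample data; any array serializes to the same delimiter layout.
$serialized = serialize([1, 2]);
echo $serialized, "\n"; // a:2:{i:0;i:1;i:1;i:2;}

// The round trip restores the original array.
$restored = unserialize($serialized);
var_dump($restored === [1, 2]); // bool(true)
```

Note that a build patched to emit different delimiters would produce serialized strings that stock PHP cannot read, so data stored by one should not be fed to the other.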

    Building the PHP 4.03 beta module for MS SQL 6.5 (mssql.dll)

    Unpack mssql.zip. Copy the *.lib files to C:\Win32Build\Lib and the *.h files to C:\Win32Build\Include.