Creation, use and analysis of metrics. Software metrics


The most widely known and used standard for organizing quality control processes is the ISO 9000 series. For the software development process, the ISO 9001 standard is applied, which covers design, development and production. It should be noted that this standard is difficult to apply directly to software development quality management, since it was originally oriented toward the development of industrial products. Specifically to support the development of software systems, ISO produced the ISO 9000-3 guide, which formulates the requirements of the ISO 9001 quality model for organizing the software development process.

Thus, the requirements of the ISO 9000-3 guide can be used to assess the quality of the development process in one's own organization or in a contractor's organization. At present the 2000 version of the standard is in widespread use; it puts process management at the forefront, but it contains no specifics related to software development.

The disadvantage of the ISO 9000 standard is the difficulty of measuring the quality level of the software development process according to the proposed quality model.

Among software developers, especially abroad (primarily in the USA), an alternative quality model is highly regarded: the CMM of the SEI. This quality model was developed at the Software Engineering Institute under the sponsorship of the US Department of Defense. Initially it was used by government, in particular military, organizations when placing orders for software development. The standard is now widely used to analyze and certify the software development processes of firms producing complex software in critical application areas. An important advantage of the CMM is the hierarchical nesting of its quality levels, which makes it possible to measure and compare the process quality levels of different organizations and to ensure effective improvement of process quality.

ISO has also now developed a quality model to measure and improve quality.

In a certain respect the CMM and ISO quality models are interchangeable; in essence they do not contradict each other, since both are based on the same quality paradigm, TQM (Total Quality Management).

It is important to note that simply having a software development process that meets a high quality level does not in itself guarantee a high-quality product. A quality process means that the quality of the resulting products will improve steadily over time. Therefore, when making decisions, one must take into account how long a process of the required quality level has been established and operating in the given technology area. At the same time, the absence of information about the quality of the process means that the quality of the product being developed is unpredictable.

Software product quality

Quality of software components

The development of modern large software systems is increasingly based on components (Component-Based System, CBS). CBS technology can significantly reduce development cost and time. At the same time, the risk associated with using software components developed by different manufacturers in one system increases.

The most effective way to solve this problem is to use metrics to manage quality and risks when building CBS, in order to measure various factors affecting the final quality of the product and eliminate sources of risk. In this case, quality metrics should be used to support decision-making at various stages of the development life cycle on the economic feasibility of using components.

As a rule, the source code of the components is inaccessible to system designers; in addition, the components expose complex structured interfaces. The consequence is a significant difference between the metrics typically applicable to traditional systems and those applicable to CBS. Most traditional metrics are used during the planning and development phases. The key to quality management when using metrics in the development of component systems is the selection of quality metrics that are applicable at all stages of the life cycle and evaluate both process quality and product quality.

5). Maintainability

Maintainability is a set of properties that indicate the effort required to carry out modifications that include adjusting, improving, and adapting software when the environment, requirements, or functional specifications change.

Maintainability includes subcharacteristics:

– Analyzability – an attribute that defines the required effort to diagnose failures or identify parts that will be modified;

– changeability – an attribute that defines the effort required to modify the software, remove errors, or introduce new capabilities into the software or its operating environment;

– stability – an attribute indicating the risk of modification;

– testability – an attribute indicating the efforts during validation, verification in order to detect errors and non-compliance with requirements, as well as the need for software modification and certification;

– consistency (compliance) – an attribute that shows the conformity of the software with the relevant standards, agreements, rules and regulations.

6). Portability – a set of indicators showing the software's ability to adapt to new runtime environment conditions. The environment can be organizational, hardware or software. Therefore, transferring software to a new execution environment may involve a set of actions aimed at ensuring its functioning in an environment different from the one in which it was created, taking into account new software, organizational and technical capabilities.

Portability includes subcharacteristics:

– adaptability – an attribute that determines the efforts spent on adapting to different environments;

– installability (ease of installation) – an attribute that determines the effort required to launch or install the software in a particular environment;

– coexistence – an attribute that determines the ability of the software to coexist with other software in a common operating environment;

– interchangeability – an attribute that provides the possibility of interoperation with other programs, given the necessary installation or adaptation of the software;

– consistency – an attribute that indicates compliance with standards or agreements for ensuring software portability.

9.1.1. Software Quality Metrics

Currently, a system of metrics has not yet been fully formed in software engineering. There are different approaches and methods for determining their set and measurement methods.

A software measurement system includes metrics and measurement models that are used to quantify its quality.

When determining software requirements, the corresponding external characteristics and their subcharacteristics (attributes) are specified, which define different aspects of the product's functioning and management in a given environment. For the set of software quality characteristics specified in the requirements, the corresponding metrics, the models for their assessment, and the range of measure values for measuring individual quality attributes are determined.

According to the standard, metrics are determined by a model for measuring software attributes at all stages of the life cycle (intermediate, internal metrics) and especially at the stage of testing or functioning (external metrics) of the product.

Let us dwell on the classification of software metrics, the rules for conducting metric analysis and the process of measuring them.

Types of metrics. There are three types of metrics:

– software product metrics that are used to measure its characteristics – properties;

– process metrics that are used to measure a property of the process used to create a product.

– usage metrics.

Software product metrics include:

– external metrics indicating product properties visible to the user;

– internal metrics that indicate properties visible only to the development team.

External product metrics include the following:

– product reliability, which serves to determine the number of defects;

– functionality, with the help of which the presence and correct implementation of functions in the product is established;

– support, with the help of which product resources are measured (speed, memory, environment);

– the applicability of the product, which helps determine the degree of accessibility for study and use;

– costs that determine the cost of the created product.

Internal product metrics include:

– dimensions necessary to measure the product using its internal characteristics;

– the complexity required to determine the complexity of the product;

– styles that serve to define approaches and technologies for creating individual components of a product and its documents.

Internal metrics measure properties of the product itself and correlate with the corresponding external metrics.

External and internal metrics are set at the stage of forming software requirements and are the subject of planning ways to achieve the quality of the final software product.

Product metrics are often described by a set of models used to set various properties and values of the quality model or to make predictions. Measurements are usually carried out after the metrics have been calibrated in the early stages of the project. A common measure is the degree of traceability, which is determined by the number of traces that can be followed in scenario models (for example, UML) and which can be the number of:

– requirements;

– scenarios and characters;

– objects included in the scenario, and localization of requirements for each scenario;

– object parameters and operations, etc.

ISO/IEC 9126–2 defines the following types of measures:

– a measure of software size in different units of measurement (number of functions, lines in a program, disk memory size, etc.);

– a measure of time (system functioning, component execution, etc.);

– measure of effort (labor productivity, labor intensity, etc.);

– accounting measures (number of errors, number of failures, system responses, etc.).

A specific measure may be the level of component reuse, measured as the ratio of the size of the product built from off-the-shelf components to the size of the overall system. This measure is used when determining the cost and quality of software. Examples of such metrics are:

– total number of objects and number of reused ones;

– total number of operations, reused and new operations;

– the number of classes that inherit specific operations;

– the number of classes on which this class depends;

– number of users of a class or operations, etc.

When estimating the total number of certain quantities, average statistical metrics are often used (for example, the average number of operations in a class, the average number of descendants of a class or class operations, etc.).

As a rule, measures are largely subjective and depend on the knowledge of experts making quantitative assessments of the attributes of software product components.

An example of widely used program metrics is the set of Halstead metrics: characteristics of a program determined from its static structure in a specific programming language, such as the number of occurrences of the most frequent operands and operators, or the length of the program description as the sum of the numbers of occurrences of all operands and operators, etc.

Based on these attributes, it is possible to calculate programming time, the level of the program (structure and quality) and programming language (abstraction of language tools and orientation to a given problem), etc.

Process metrics include metrics:

– costs that determine the costs of creating a product or project architecture, taking into account originality, support, and development documentation;

– estimates of the cost of specialists' work in person-days or person-months;

– process unreliability – the number of undetected defects during design;

– repeatability, which establishes the extent to which repeated components are used.

Process metrics can include development time, the number of errors found during the testing phase, etc. The following process metrics are practically used:

– total development time and separate time for each stage;

– time of model modification;

– time of work on the process;

– number of errors found during inspection;

– cost of quality control;

– cost of the development process.

Usage metrics serve to measure the degree of satisfaction of the user's needs when solving his problems. They help evaluate not the properties of the program itself, but the results of its operation - operational quality. An example is the accuracy and completeness of the implementation of user tasks, as well as resources (labor costs, productivity, etc.) spent on effectively solving user tasks. The assessment of user requirements is carried out mainly using external metrics.

9.1.2. Standard method for assessing quality indicator values

Software quality assessment according to the four-level quality model starts from the lowest level of the hierarchy, i.e. from the most elementary property of the assessed attribute of the quality indicator, according to established measures. At the design stage, the values of the evaluation elements are established for each attribute of the indicator of the analyzed software included in the requirements.

ISO/IEC 9126-2 defines a software quality metric as "a model for measuring an attribute associated with a measure of its quality." To use metrics when measuring quality indicators, the standard allows the following types of measures to be defined:

– measures of size in different units of measurement (number of functions, program size, amount of resources, etc.);

– time measures – periods of real, processor or calendar time (system operation time, component execution time, use time, etc.);

– measures of effort – productive time spent on project implementation (labor productivity of individual project participants, collective labor intensity, etc.);

– measures of intervals between events, for example, time between successive failures;

– counting measures – counters for determining the number of detected errors, the structural complexity of the program, the number of incompatible elements, the number of changes (for example, the number of detected failures, etc.).

Quality metrics are used to assess the degree of testability after testing the software on a variety of tests (failure-free operation, feasibility of functions, ease of use of user interfaces, databases, etc.).

MTBF (mean time between failures), as a reliability attribute, determines the average time between the appearance of threats that violate security, and provides a difficult-to-measure estimate of the damage caused by such threats.

Very often a program is evaluated by its number of lines of code. When comparing two programs that implement the same application task, preference is given to the shorter program, since it is usually created by more qualified personnel, contains fewer hidden errors and is easier to modify. A longer program is more expensive, since it takes more time to debug and modify. That is, program length can be used as an auxiliary property when comparing programs, provided the developers have similar skill levels, a common development style and a common environment.

If the software requirements specify that several indicators must be obtained, then each indicator calculated after data collection during execution is multiplied by the corresponding weighting factor, and all indicators are then summed to obtain a comprehensive assessment of the software quality level.

Based on measuring quantitative characteristics and conducting an examination of qualitative indicators using weighting coefficients that level out different indicators, the final assessment of product quality is calculated by summing up the results for individual indicators and comparing them with software benchmark indicators (cost, time, resources, etc.).

That is, when an individual indicator is assessed using evaluation elements, a weighting coefficient is calculated for the k-th metric of the j-th indicator and the i-th attribute. For example, take portability as the j-th indicator. This indicator is calculated from five attributes (i = 1, ..., 5), each of which is multiplied by the corresponding coefficient k_i.

The metrics of all attributes of the j-th indicator are summed and form the value of that quality indicator. When all attributes have been assessed for each of the quality indicators, a total assessment of each individual indicator is made, and then an integral quality assessment is computed taking into account the weighting coefficients of all software indicators.
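
As an illustration of this weighted-summation scheme, here is a minimal Python sketch; the indicator names, attribute values and weights are invented for the example and are not prescribed by the standard.

```python
# Hypothetical example: integral quality assessment as a weighted sum of indicators,
# each of which is itself a weighted sum of its attribute metrics.

# j-th indicator -> list of (attribute metric value m_i, weight k_i); values are illustrative.
indicators = {
    "portability":     [(0.8, 0.25), (0.6, 0.20), (0.9, 0.20), (0.7, 0.20), (0.5, 0.15)],
    "maintainability": [(0.7, 0.5), (0.9, 0.5)],
}
# Weight of each indicator in the integral assessment (also illustrative).
indicator_weights = {"portability": 0.4, "maintainability": 0.6}

def indicator_score(attrs):
    """Weighted sum of attribute metrics for one indicator."""
    return sum(m * k for m, k in attrs)

integral = sum(indicator_score(attrs) * indicator_weights[name]
               for name, attrs in indicators.items())

for name, attrs in indicators.items():
    print(f"{name}: {indicator_score(attrs):.3f}")
print(f"integral quality assessment: {integral:.3f}")
```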

Ultimately, the result of quality assessment is a criterion for the effectiveness and feasibility of using design methods, tools and methods for assessing the results of creating a software product at the stages of life cycle.

To assess the values of quality indicators, a standard is used that provides the following methods: measurement, registration, calculation and expert assessment (as well as combinations of these methods).

The measurement method is based on the use of measuring and special software tools to obtain information about the characteristics of the software, for example, determining its volume, the number of lines of code, operators, the number of branches in the program, the number of entry (exit) points, reactivity, etc.

The registration method is used when counting time, the number of failures or faults, and the start and end of software operation during its execution.

The calculation method is based on statistical data collected during testing, operation and maintenance of the software. Calculation methods are used to evaluate indicators of reliability, accuracy, stability, reactivity, etc.

The expert method is carried out by a group of experts competent in the given task or type of software. Their assessment is based on experience and intuition rather than on the direct results of calculations or experiments. The method is carried out by reviewing programs, code and accompanying documents, and contributes to a qualitative assessment of the created product. For this purpose, controlled characteristics are established, correlated with one or more quality indicators and included in expert survey forms. The method is used to evaluate indicators such as analyzability, documentability, software structure, etc.

To assess the values of quality indicators, depending on the characteristics of the properties involved, their purpose and the methods for determining them, the following scales are used:

– metric (1.1 – absolute, 1.2 – relative, 1.3 – integral);

– ordinal (rank), which allows you to rank characteristics by comparison with reference ones;

– classification, characterizing only the presence or absence of the property under consideration in the software being evaluated.

Indicators calculated using metric scales are called quantitative, and those calculated using ordinal and classification scales are called qualitative.

Attributes of a software system that characterize its quality are measured using quality metrics. The metric defines the measure of the attribute, i.e. a variable that is assigned a value as a result of a measurement. To ensure proper use of measurement results, each measure is identified by a measurement scale.

– the nominal scale reflects the categories of properties of the assessed object without their ordering;

– an ordinal scale is used to order characteristics in ascending or descending order by comparing them with basic values;

– an interval scale specifies the essential properties of an object (for example, a calendar date);

– the relative scale specifies a certain value relative to the selected unit;

– the absolute scale indicates the actual value of the quantity (for example, the number of errors in the program is 10).

9.1.3. Software (PS) quality management

Quality management is understood as the set of organizational structures and responsible persons, as well as procedures, processes and resources for planning and managing the achievement of software quality. Quality management, SQM (Software Quality Management), is based on applying the standard provisions for quality assurance, SQA (Software Quality Assurance).

The purpose of the SQA process is to ensure that products and processes conform to requirements and are consistent with plans; it includes the following activities:

– implementation of standards and relevant procedures for developing software at the stages of life cycle;

– assessment of compliance with the provisions of these standards and procedures.

Quality assurance consists of the following:

– checking the consistency and feasibility of plans;

– coordination of intermediate work products with planned indicators;

– checking manufactured products against specified requirements;

– analysis of applied processes for compliance with the contract and plans;

– checking that the development environment and methods are consistent with the development contract;

– verification of accepted metrics of products, processes and methods of their measurement in accordance with the approved standard and measurement procedures.

The purpose of the SQM management process is to monitor (systematically control) quality so as to ensure that the product will satisfy the customer; it involves the following activities:

– determination of quantitative quality properties based on identified and anticipated user needs;

– management of the implementation of set goals to achieve quality.

SQM is based on ensuring that:

– goals for achieving the required quality are established for all work products at product control points;

– the strategy for achieving quality, metrics, criteria, techniques, requirements for the measurement process, etc. have been determined;

– actions related to providing products with quality properties are defined and carried out;

– quality control (SQA, verification and validation) is carried out against the set goals; if the goals are not achieved, the processes are adjusted;

– processes of measuring and evaluating the final product to achieve the required quality are carried out.

The main standard provisions for creating a quality product and assessing the level of achievement highlight two quality assurance processes at the stages of the PS life cycle:

– guarantee (confirmation) of the quality of the software as a result of certain activities at each stage of the life cycle with verification of compliance of the system with standards and procedures focused on achieving quality;

– quality engineering, as the process of providing software products with functionality, reliability, maintainability and other quality characteristics.

Quality processes are designed to:

a) managing, developing and maintaining quality assurance in accordance with specified standards and procedures;

b) configuration management (identification, status accounting and authentication actions), risk management and project management in accordance with standards and procedures;

c) control of the basic version of the software and the quality characteristics implemented in it.

Execution of these processes includes the following actions:

– assessment of the standards and procedures that are followed when developing programs;

– audit of management, development and provision of quality assurance for software, as well as project documentation (reports, development schedules, messages, etc.);

– control of formal inspections and reviews;

– analysis and control of acceptance testing of the software.

For an organization that develops software, including software built from components, quality engineering must be supported by a quality system and by quality management (planning, accounting and control).

Quality Engineering includes a set of methods and activities by which software products are checked to meet quality requirements and are provided with the characteristics provided for in the software requirements.

A quality system (QS) is a set of organizational structures, methodologies, activities, processes and resources for implementing quality management. To ensure the required level of software quality, two approaches are used: one is focused on the final software product, and the other on the process of creating the product.

In a product-oriented approach, quality assessment is carried out after testing the PS. This approach is based on the assumption that the more errors are detected and eliminated in a product during testing, the higher its quality.

In the second approach, measures are envisaged and taken to prevent, promptly identify and eliminate errors, starting from the initial stages of the life cycle in accordance with the plan and procedures for ensuring the quality of the developed software. This approach is presented in the ISO 9000 and 9000-1,2,3 series of standards. The purpose of standard 9000–3 is to provide recommendations to development organizations to create a quality system according to the scheme shown in Fig. 9.3.

[Figure 9.3 (diagram): the quality system, work management on the contractor's side and responsibilities on the customer's side; general policy; responsibility and authority; controls; the plan for achieving software quality.]

Fig. 9.3. Standard requirements for organizing a quality system

An important place in quality engineering is given to the process of measuring the characteristics of life cycle processes, its resources and the work products created on them. This process is implemented by the Quality, Verification and Testing team. The functions of this group include: planning, operational management and quality assurance.

Quality planning represents an activity aimed at defining goals and quality requirements. It covers identification, setting of objectives, quality requirements, classification and quality assessment. A calendar plan is drawn up to analyze the state of development and consistently measure the planned indicators and criteria at the stages of the life cycle.

Operational management includes the methods and types of operational activities used for ongoing management of the design process and for eliminating the causes of unsatisfactory functioning of the software.

Quality assurance consists of performing and verifying that the development object fulfills specified quality requirements. Quality assurance goals can be internal or external. Internal goals are to create confidence in the project manager that quality is being ensured. External goals are to create confidence in the user that the required quality has been achieved and the result is high-quality software.

Experience shows that a number of companies producing software products have quality systems, which ensures they produce competitive products. The quality system includes monitoring the demand for a new type of product, control of all stages of PS production, including the selection and supply of finished system components.

In the absence of appropriate quality services, software developers must apply their own regulatory and methodological documents regulating the process of software quality management for all categories of developers and users of software products.

9.2. Reliability assessment models

Of all the areas of software engineering, software reliability is the most thoroughly researched. It was preceded by the development of the theory of hardware reliability, which influenced the development of software reliability. Software reliability was addressed both by developers, who tried to ensure reliability satisfying the customer by various system means, and by theorists who, studying the nature of software operation, created mathematical reliability models that take into account various aspects of software operation (the occurrence of errors, faults, failures, etc.) and allow real reliability to be evaluated. As a result, software reliability has emerged as an independent theoretical and applied discipline.

The reliability of complex software differs significantly from the reliability of hardware. Data storage media (files, servers, etc.) are highly reliable: records on them can be stored for a long time without loss, since they are not subject to physical destruction and aging.

From the applied point of view, reliability is the ability of the software to maintain its properties (failure-free operation, stability, etc.) while transforming initial data into results over a certain period of time under certain operating conditions. A decrease in software reliability is caused by errors in requirements, design and implementation. Failures and faults depend on how the product was produced and appear in programs during execution over a certain period of time.

For many systems (programs and data), reliability is the main objective function of the implementation. Some types of systems (real-time systems, radar systems, security systems, medical equipment with embedded programs, etc.) are subject to high reliability requirements, such as fault tolerance, dependability, safety and security.


In this article I want to look at some of the most important QA metrics, in my opinion. These are indicators, coefficients and ratios that allow you to capture the overall picture of what is happening on the project in terms of quality and determine steps to improve it. The metrics cover five different areas: requirements, software quality, testing team effectiveness, quality of QA work, and feedback. It is important to measure and track indicators across different slices of the software development process simultaneously in order to detect common root problems and to be able to tune and optimize the entire process.

Group 1 - Requirements for the software being developed

This group of metrics will allow us to evaluate how well we have worked out the requirements (user story) for the software, identify vulnerabilities and the most complex, potentially problematic software features, and understand where special control is required:

1. Test coverage of requirements

In other words, this is the number of tests per requirement.

Metric purpose: identify weaknesses in test coverage and highlight risks.

  • This metric will only work if the requirements are well decomposed and more or less equivalent. That is not always achievable, but if you can make the requirements sufficiently atomic, the metric will show the deviation of each requirement's coverage from the average level. The more the value differs from 1, the fewer (or more) tests are written for that requirement compared to the average.
  • The most important thing to pay attention to are the requirements for which the coefficient will be equal to or close to 0. For these, you need to consider adding tests.
  • If the requirements are not atomic, then this metric will only ensure that there is at least 1 test for each requirement. For this, the coefficient must always be greater than 0.
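
A minimal sketch of how this coefficient could be computed, assuming the tests-per-requirement counts have already been exported from a test management tool (the data below is made up):

```python
# Tests per requirement, e.g. exported from a test management tool (illustrative data).
tests_per_requirement = {"REQ-1": 4, "REQ-2": 2, "REQ-3": 0, "REQ-4": 6}

average = sum(tests_per_requirement.values()) / len(tests_per_requirement)

for req, n_tests in tests_per_requirement.items():
    coverage_ratio = n_tests / average if average else 0.0  # deviation from the average level
    flag = "  <- consider adding tests" if n_tests == 0 else ""
    print(f"{req}: {n_tests} tests, ratio to average = {coverage_ratio:.2f}{flag}")
```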

2. Degree of interconnectedness of requirements

The metric is calculated as the average number of connections of each requirement with other requirements.

Metric purpose: provide a basis for assessing the timing of testing and taking into account possible risks. Knowing the degree of mutual influence of requirements on each other, you can, for example, plan additional time and cases for end-to-end testing, work on regression checks, look towards integration, etc.

  • The value of this metric ranges from 0 to 1: 1 means that every requirement is related to every other requirement, and 0 means there are no relationships.
  • It is difficult to impose strict limits on the values of this coefficient; much depends on the specifics of the functionality, architecture and technologies. From my own experience I can say that it is good when the degree of connectivity does not exceed 0.2-0.3. Otherwise, a modification within one requirement will lead to a chain of changes, and therefore possible errors, in a significant part of the product.
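
A possible way to compute this coefficient is sketched below; normalizing the average number of links by (N − 1), so that a fully interconnected set of requirements gives exactly 1, is my assumption based on the 0 to 1 range described above.

```python
# Links between requirements (illustrative traceability data).
links = {
    "REQ-1": {"REQ-2", "REQ-3"},
    "REQ-2": {"REQ-1"},
    "REQ-3": {"REQ-1"},
    "REQ-4": set(),
}

n = len(links)
average_links = sum(len(related) for related in links.values()) / n
# Normalize so that a fully connected set of requirements gives 1.0 (assumed normalization).
connectedness = average_links / (n - 1) if n > 1 else 0.0
print(f"degree of interconnectedness: {connectedness:.2f}")
```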

3. Requirements stability factor

Metric purpose: show how many already implemented requirements have to be redone from release to release when developing new features.

  • Of course, completely isolated functionality does not exist, but the number of new requirements should prevail over the changing ones, and the coefficient should preferably be less than 0.5. In this case, we introduce 2 times more new features than we rework existing ones.
  • If the coefficient is higher than 0.5, especially if it is greater than 1, then this most likely means that we previously did something that turned out to be unnecessary. The team focuses not on creating new business values, but on reworking previously released features.
  • The metric also gives an idea of ​​how easily the system’s functionality can be scaled and new features added.
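
The article gives no explicit formula, so the sketch below assumes the coefficient is the number of already implemented requirements reworked in a release divided by the number of new requirements in that release, which matches the 0.5 threshold discussed above.

```python
def requirements_stability(changed_existing: int, new_requirements: int) -> float:
    """Reworked existing requirements per new requirement in a release (assumed formula)."""
    return changed_existing / new_requirements if new_requirements else float("inf")

# Illustrative release data: 6 reworked requirements vs 20 new ones.
coefficient = requirements_stability(changed_existing=6, new_requirements=20)
print(f"stability factor: {coefficient:.2f}")  # < 0.5 -> mostly new functionality
```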

Group 2 - Quality of the product being developed

As the name suggests, this group of metrics demonstrates the quality of the software, as well as the quality of the development itself.

1. Defect density

The percentage of defects per individual module during an iteration or release is calculated.

Metric purpose: highlight which part of the software is the most problematic. This information will help in assessing and planning work with this module, as well as in risk analysis.

  • The reasons for a large number of defects in any one specific module (coefficient greater than 0.3) can be different: poor quality requirements, developer qualifications, technical complexity, etc. In any case, this metric will immediately draw our attention to the problem area.
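
A minimal sketch of the per-module share of defects; the 0.3 threshold comes from the text above, while the module names and counts are invented.

```python
# Defects found per module during an iteration or release (illustrative data).
defects_per_module = {"auth": 12, "billing": 40, "reports": 8, "api": 20}

total = sum(defects_per_module.values())
for module, count in defects_per_module.items():
    share = count / total
    warning = "  <- problem area (share > 0.3)" if share > 0.3 else ""
    print(f"{module}: {share:.2f} of all defects{warning}")
```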

2. Regression coefficient

Metric purpose: show where the team’s efforts go: are we more involved in creating and debugging new features or are we forced to patch up existing parts of the software most of the time?

  • The closer the coefficient is to 0, the fewer errors were introduced into the existing functionality when implementing new requirements. If the value is greater than 0.5, then we spend more than half of the time restoring previously working software functions
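
A minimal sketch, assuming the coefficient is the share of release defects raised against previously working (existing) functionality among all defects of the release:

```python
def regression_coefficient(defects_in_existing: int, defects_total: int) -> float:
    """Share of release defects that broke previously working functionality (assumed formula)."""
    return defects_in_existing / defects_total if defects_total else 0.0

print(regression_coefficient(defects_in_existing=18, defects_total=50))  # 0.36
```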

3. Reopened defect rate

Metric purpose: assess the quality of development and correction of defects, as well as the complexity of the product or individual module

  • This metric can be calculated for the entire software product, an individual module or a piece of functionality. The closer the resulting value is to 0, the fewer old errors are repeated during development.
  • If the coefficient turns out to be more than 0.2-0.3, this may indicate either the technical complexity of the module and the highly related requirements in it, or a clumsy architecture, or that the previous fix was made poorly.
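
A minimal sketch, assuming the coefficient is the number of reopened defects divided by the total number of defects in the scope being analyzed (product, module or feature):

```python
def reopened_rate(reopened: int, total_defects: int) -> float:
    """Reopened defects as a share of all defects for a module, feature or release (assumed formula)."""
    return reopened / total_defects if total_defects else 0.0

rate = reopened_rate(reopened=7, total_defects=45)
print(f"reopened defect rate: {rate:.2f}")  # > 0.2-0.3 would be a warning sign per the text
```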

4. Average cost of fixing a defect

The ratio of the amount of costs incurred by the team when working with all defects (for example, as part of a release) to the total number of defects.

Metric purpose: show how expensive it is for us to detect and correct each defect. This will make it possible to calculate the benefits of reducing the number of mistakes made and evaluate the feasibility of the appropriate techniques.

  • Of course, there are no correct values here; everything will be determined by the specifics of a particular situation.

5. Number of defects in a specific developer’s code

Metric purpose: highlight possible difficulties in the development team, which specialists lack experience, knowledge or time and need help.

  • If, for example, 50% of all defects are accounted for by 1 developer, and there are 5 of them in the team, then there is clearly a problem. This does not mean that this programmer does not work well, but it signals that it is imperative to understand the reasons for this situation.
  • The metric, among other things, can be an indicator of a module, functionality or system that is particularly difficult to develop and support.

Group 3 – QA Team Capability and Effectiveness

The main purpose of this group of metrics is to express in numbers what the testing team is capable of. These indicators can be calculated and compared on a regular basis, trends can be analyzed, and one can observe how the team’s work is affected by certain changes.

1. Velocity of the QA team

It is calculated as the ratio of implemented story points (or requirements, or user stories) over several, for example, 4-5 iterations (Sprint) to the number of selected iterations.

Metric purpose: numerically express the capabilities and speed of the team’s work for further planning of the scope of work and analysis of development trends

  • The metric allows you to monitor the speed of QA work and observe what internal processes or external influences on the team can affect this speed.

2. Average defect lifetime

The ratio of the total time during which the defects found within an iteration or release remained open to the number of those defects.

Metric purpose: show how much time on average it takes to work with one defect: to register it, correct it and reproduce it. This indicator will allow you to estimate the time required for testing and highlight areas of the software with which the greatest difficulties arise.

  • Typically, the lifetime of a defect is the entire time from its creation (Created status) to closure (Closed), minus all possible Postponed and Hold. Any bug tracker allows you to calculate and upload this information for a separate sprint or release.
  • Also, the average lifetime of a defect can be calculated for various modules and software functions, or, most interestingly, separately for each of the testers and developers from the team. This gives you a chance to identify particularly complex modules or a weak link in the software team.
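
A sketch of computing the average lifetime from bug-tracker timestamps, subtracting Postponed/Hold intervals as described above; the data layout is invented rather than taken from any real bug tracker's API.

```python
from datetime import datetime, timedelta

# Illustrative export: (created, closed, [paused intervals while the defect was Postponed/Hold]).
defects = [
    (datetime(2024, 3, 1, 10), datetime(2024, 3, 3, 10),
     [(datetime(2024, 3, 2, 10), datetime(2024, 3, 2, 22))]),
    (datetime(2024, 3, 2, 9), datetime(2024, 3, 2, 18), []),
]

def lifetime(created, closed, pauses):
    """Open time of one defect minus all Postponed/Hold intervals."""
    paused = sum((end - start for start, end in pauses), timedelta())
    return (closed - created) - paused

average = sum((lifetime(*d) for d in defects), timedelta()) / len(defects)
print(f"average defect lifetime: {average}")
```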

Group 4 - Quality of work of the testing team

The purpose of this set of metrics is to evaluate how well testers perform their tasks and to determine the level of competencies and maturity of the QA team. Having such a set of indicators, you can compare the team with itself at different points in time or with other, external testing groups.

1. Efficiency of tests and test cases

Metric purpose: show how many errors on average our cases can detect. This metric reflects the quality of the test design and helps to monitor the trend of its change.

  • It is best to calculate this metric for all sets of tests: for separate groups of functional tests, regression set, Smoke testing, etc.
  • This indicator of the “lethality” of tests allows you to monitor the effectiveness of each of the kits, how it changes over time and supplement them with “fresh” tests.

2. Rate of errors missed to production

The number of errors discovered after release divided by the total number of errors found in the software during testing and after release.

Metric purpose: demonstrate the quality of testing and the efficiency of error detection - what proportion of defects were filtered out and what proportion went through to production.

  • The acceptable percentage of errors that were missed in production will, of course, depend on many factors. However, if the coefficient is >0.1, this is bad. This means that every tenth defect was not detected during testing and led to problems in the software already distributed to users.
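
A minimal sketch of this ratio, with made-up numbers:

```python
def missed_to_production_rate(found_after_release: int, found_during_testing: int) -> float:
    """Share of all known defects that slipped past testing into production."""
    total = found_after_release + found_during_testing
    return found_after_release / total if total else 0.0

# 4 of 60 known defects escaped to production -> ~0.07, below the 0.1 alarm level from the text.
print(missed_to_production_rate(found_after_release=4, found_during_testing=56))
```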

3. Real work time of the QA team

The ratio of the time spent by the team directly on QA activities to the total number of hours.

Metric purpose: firstly, to increase the accuracy of planning, and secondly, to monitor and manage the performance of a particular team.

  • Target activities include analysis, design, assessments, testing, work meetings and much more. Possible side effects are downtime due to blockers, communication problems, unavailability of resources, etc.
  • Naturally, this coefficient will never be equal to 1. Practice shows that for effective teams it can be 0.5-0.6.

4. Accuracy of time estimation by areas / types of work

Metric purpose: allows the use of a correction factor for subsequent assessments.

  • The degree of assessment accuracy can be determined for the entire team or individual testers, for the entire system or individual software modules.

5. Share of unconfirmed (rejected) defects

Metric purpose: show how many defects were created in vain (subsequently rejected).

  • If the percentage of defects that were rejected exceeds 20%, then the team may experience a desynchronization in understanding what is a defect and what is not.

Group 5 - Feedback and User Satisfaction

And finally, a group of metrics showing how the product was accepted by end users, how well it met their expectations. But not only software feedback is important: another important task of this group of metrics is to show whether users are satisfied with the process of interaction with the IT team in general and QA in particular.

1. User satisfaction with IT services

Regular survey of user satisfaction with IT services with scoring.

Metric purpose: show whether users trust the IT team, whether they understand how and why its work is organized, and how well this work meets expectations.

  • The metric can serve as an indicator that it is necessary to focus on optimizing the process or making it clearer and more transparent for users.
  • The satisfaction indicator can be calculated based on the results of a survey following the release. We collect all the grades and calculate the average score. This score can then be recalculated after changes are made to the process.

2. User satisfaction with the product

Regular survey of users about how satisfied they are with the product.

Metric purpose: determine how well the product being developed meets user expectations, whether we are moving in the right direction, whether we correctly determine the importance of features and choose solution options.

  • To calculate this metric, we also conduct a user survey and calculate the average score. By calculating this indicator on a regular basis (for example, after each release), you can monitor the trend in user satisfaction.

3. Stakeholder engagement

Number of initiatives and proposals to improve the process and product received during the iteration (release) from stakeholders

Metric purpose: determine the degree of participation of external stakeholders in the work on the product. Having such a metric in hand, you can navigate where you need to get feedback so that one day you don’t face contempt and hatred, problems and misunderstanding.


Let's create a report. In the metrics, select “Achieving goals” – “The goal for which you have set the conversion.” This is usually a “Thank you for your purchase” page.

As a result, we will receive data on how many purchases there were for each advertising campaign and how much was spent on attracting the users who made them. Divide the cost of clicks by the number of conversions to get the cost of one lead. If revenue tracking is set up, you can add a "Revenue" column to estimate the profit received.

Segments for retargeting and bid adjustments: a new level of relationship with potential buyers

In this section, we will create and save Yandex.Metrica segments and define the adjustments to use when setting up campaigns in Direct.

Remember to set the time period so that the sample is representative. It is necessary that the data be built on the basis of the behavior of a large group of visitors.

Gender and age – adjustment

After creating this report, we will be able to see who buys better from our site, men or women, and what is the age of such buyers. After this, nothing will prevent us from adjusting the rates for this segment.

Select: “Reports” – “Visitors” – “Gender” (1).

As a result, we make adjustments for women. At the same time, the data obtained helped us see that representatives of the stronger sex also spend time on our website. You need to work with this information. For example, write relevant advertisements.

Days and hours – adjustment

Your visitors may have different activities during the day or week, so at this point we will identify the most converting days and hours for your resource, after which you can set time adjustments in Direct.

“Reports” – “Visitors” – “Attendance by time of day”.

In the groupings we add: Behavior: date and time – “Date/time fragments” – “Day of week of visit” (2). Select a goal and sort by conversion. We receive a report that shows on what day and at what time the conversion is maximum.

Geography

“Reports” – “Visitors” – “Geography”.

The report will help the site identify regions that sell better than others. Typically, for many niches, the lion's share of sales comes from Moscow or St. Petersburg and their regions. Therefore, most advertisers split their advertising campaigns into the federal cities and the rest of Russia.

The geography report will help you find a course for further fragmentation of campaigns in Direct or identify regional advertising campaigns with weak returns.

Segment “Forgotten cart”

We create: “Reports” – “Visitors” – “Time since first visit”.

For goals, we will select a macro goal - purchase, appointment for a consultation in the office, etc. We sort by conversion. Select the first 2 lines to plot the graph. As a result, we will get information about how much time our customers spend thinking about their purchasing decision. In addition, we will be able to use data in the Direct interface to avoid showing advertising to those who no longer need our offer.

From the report we see that the goal is mainly achieved on the day of the visit, but throughout the month users return and convert.

Now the segment itself. We will create it for those who left the item in the cart but never purchased it.

Let’s go to the already familiar “Sources” – “Summary” report, leave a checkmark only in the “Advertising conversions” column, click + and select from the menu: “Behavior” – “Achieving goals” – “Goal: added to cart” (javascript goal must be set to "Add to Cart" buttons). We save and name the segment, now go to Direct.

We find the ad that we want to show to this segment, click on “Audience selection conditions”, then on “Add condition”.

Reports for website analysis: study and improve

Webvisor

Its data will help us identify the site’s weaknesses and understand what difficulties users encounter.

Let's look at the Webvisor segments for visits in which our macro goal was achieved.

Let's take a sample and see how users achieved it. Perhaps we will understand behavioral patterns of our customers that we were not aware of. What if, before submitting an order, most of them looked at a photo album or interacted with interactive elements on the site, and perhaps spent a long time on reviews? Such data will help you decide how to correctly position and design blocks on the site.

The second segment is users who spent enough time on our website to make a purchase, but never made it. Analyzing such visits will provide insight into the main challenges faced by visitors.

Scrolling/click maps

A scroll map will help you understand which screen your visitors spend more time on. Perhaps some necessary information that will help make a purchasing decision is in the “cold zone” and needs to be moved to another place. For example, a client is advertised only for requests indicating a metro station, and a map with an address and directions is located at the bottom of the page.

The result is a high percentage of refusals, because the location of the organization’s office is important to clients who come with such requests.

In addition to SLOC, quantitative characteristics also include:

  • number of empty lines,
  • number of comments,
  • percentage of comments (the ratio of the number of lines containing comments to the total number of lines, expressed as a percentage),
  • average number of lines for functions (classes, files),
  • average number of lines containing source code for functions (classes, files),
  • average number of lines per module.
Sometimes the program's stylistic rating (F) is additionally distinguished. It is obtained by dividing the program into n equal fragments and calculating an estimate for each fragment using the formula F_i = SIGN(N_comm,i / N_i - 0.1), where N_comm,i is the number of comments in the i-th fragment and N_i is the total number of lines of code in the i-th fragment. The overall score for the entire program is then F = SUM F_i.
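
As a rough illustration, the sketch below computes the comment percentage and the SIGN-based stylistic rating F for Python source; treating only full lines starting with "#" as comments is a simplification.

```python
def stylistic_rating(lines, fragments=4):
    """Comment percentage and the SIGN-based stylistic rating F described above."""
    sign = lambda x: (x > 0) - (x < 0)
    is_comment = lambda line: line.lstrip().startswith("#")  # simplification: full-line comments only

    pct_comments = 100.0 * sum(map(is_comment, lines)) / len(lines)

    size = max(1, len(lines) // fragments)                    # split into roughly equal fragments
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    f = sum(sign(sum(map(is_comment, chunk)) / len(chunk) - 0.1) for chunk in chunks)
    return pct_comments, f

with open(__file__, encoding="utf-8") as src:                 # rate this very file as a demo
    print(stylistic_rating(src.read().splitlines()))
```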

Also included in the group of metrics based on counting certain units in the program code are Halstead metrics. These metrics are based on the following indicators:

n1 – the number of unique program operators, including separator symbols, procedure names and operation signs (the operator dictionary),

n2 – the number of unique program operands (the operand dictionary),

N1 – the total number of operators in the program,

N2 – the total number of operands in the program,

n1' – the theoretical number of unique operators,

n2' – the theoretical number of unique operands.

Taking into account the introduced notations, we can determine:

n = n1 + n2 – the program dictionary (vocabulary),

N = N1 + N2 – the program length,

n' = n1' + n2' – the theoretical dictionary of the program,

N' = n1*log2(n1) + n2*log2(n2) – the theoretical program length (for stylistically correct programs the deviation of N from N' does not exceed 10%),

V = N*log2(n) – the program volume,

V' = N'*log2(n') – the theoretical volume of the program, where n' is the theoretical dictionary of the program,

L = V'/V – the programming quality level; for an ideal program L = 1,

L' = (2*n2)/(n1*N2) – the programming quality level based only on the parameters of the real program, without taking theoretical parameters into account,

EC = V/(L')^2 – the difficulty of understanding the program,

D = 1/L' – the complexity of coding the program,

λ' = V/D^2 – the level of the expression language,

I = V/D – the information content of the program; this characteristic makes it possible to estimate the mental effort of creating the program,

E = N'*log2(n/L) – an estimate of the intellectual effort required to develop the program, characterizing the number of elementary decisions needed when writing it.

When using Halstead metrics, the disadvantages associated with the ability to record the same functionality with a different number of lines and operators are partially compensated.
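
The sketch below computes the main Halstead measures from operator and operand frequencies that have already been counted; tokenizing real source code into operators and operands is language-specific and is left out here.

```python
from math import log2

def halstead(operator_counts: dict, operand_counts: dict) -> dict:
    """Basic Halstead measures from operator/operand frequency dictionaries."""
    n1, n2 = len(operator_counts), len(operand_counts)                    # unique operators / operands
    N1, N2 = sum(operator_counts.values()), sum(operand_counts.values())  # total occurrences
    n, N = n1 + n2, N1 + N2                                               # vocabulary and length
    V = N * log2(n)                                                       # program volume
    L2 = (2 * n2) / (n1 * N2)                                             # L' - level from real parameters only
    return {"n": n, "N": N, "V": round(V, 1),
            "N_theoretical": round(n1 * log2(n1) + n2 * log2(n2), 1),
            "L'": round(L2, 3), "D": round(1 / L2, 1), "I": round(V * L2, 1)}

# Illustrative counts for a tiny program fragment.
print(halstead({"=": 3, "+": 2, "print": 1, ";": 4}, {"a": 3, "b": 2, "1": 1, "2": 1}))
```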

Another type of software metrics that are quantitative are Gilb metrics. They show the complexity of software based on the program's density of conditional statements or looping statements. This metric, despite its simplicity, quite well reflects the complexity of writing and understanding a program, and when adding such an indicator as the maximum level of nesting of conditional and cyclic statements, the effectiveness of this metric increases significantly.
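
In the spirit of the Gilb metrics, the following sketch counts the density of conditional and loop statements and their maximum nesting depth for Python code using the standard ast module; which node types to treat as branching statements is my own choice.

```python
import ast

BRANCHES = (ast.If, ast.For, ast.While)  # which statements count as conditional/loop is an assumption

def gilb_metrics(source: str) -> dict:
    tree = ast.parse(source)
    statements = sum(isinstance(node, ast.stmt) for node in ast.walk(tree))
    branches = sum(isinstance(node, BRANCHES) for node in ast.walk(tree))

    def max_nesting(node, level=0):
        level += isinstance(node, BRANCHES)
        return max([level] + [max_nesting(child, level) for child in ast.iter_child_nodes(node)])

    return {"CL": branches / statements if statements else 0.0,  # saturation with branch/loop statements
            "max_nesting": max_nesting(tree)}

print(gilb_metrics("for i in range(3):\n    if i % 2:\n        print(i)\n"))
```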

2. Metrics of program control flow complexity

The next large class of metrics, based not on quantitative indicators, but on the analysis of the program control graph, is called program control flow complexity metrics.

Before directly describing the metrics themselves, for a better understanding, the control graph of the program and the method for constructing it will be described.

Let some program be presented. For this program, a directed graph is constructed containing only one input and one output, while the vertices of the graph are correlated with those sections of the program code in which there are only sequential calculations and there are no branch and loop operators, and the arcs are correlated with transitions from block to block and branches of program execution. Condition for constructing this graph: each vertex is reachable from the initial vertex, and the final vertex is reachable from any other vertex.

The most common estimate based on the analysis of the resulting graph is the cyclomatic complexity of the program (McCabe cyclomatic number). It is defined as V(G)=e - n + 2p, where e is the number of arcs, n is the number of vertices, p is the number of connected components. The number of connected components of a graph can be thought of as the number of arcs that need to be added to transform the graph into a strongly connected one. A graph is called strongly connected if any two vertices are mutually reachable. For graphs of correct programs, i.e. graphs that do not have sections that are unreachable from the entry point and do not have “dangling” entry and exit points, a strongly connected graph is usually obtained by closing an arc from a vertex denoting the end of the program to a vertex denoting the entry point to this program. Essentially, V(G) determines the number of linearly independent circuits in a strongly connected graph. So in correctly written programs p=1, and therefore the formula for calculating cyclomatic complexity takes the form:

V(G)=e - n + 2.
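
A minimal sketch that computes V(G) = e − n + 2p for a control graph given as an adjacency list; the connected components are counted on the undirected version of the graph.

```python
def cyclomatic_complexity(cfg: dict) -> int:
    """V(G) = e - n + 2p for a control graph given as {vertex: [successor vertices]}."""
    vertices = set(cfg) | {w for succ in cfg.values() for w in succ}
    e = sum(len(succ) for succ in cfg.values())

    # Count connected components p on the undirected version of the graph.
    undirected = {v: set() for v in vertices}
    for v, succ in cfg.items():
        for w in succ:
            undirected[v].add(w)
            undirected[w].add(v)
    seen, p = set(), 0
    for v in vertices:
        if v not in seen:
            p += 1
            stack = [v]
            while stack:
                u = stack.pop()
                if u not in seen:
                    seen.add(u)
                    stack.extend(undirected[u] - seen)
    return e - len(vertices) + 2 * p

# if/else diamond: entry -> (then | else) -> exit; e = 4, n = 4, p = 1 -> V(G) = 2
print(cyclomatic_complexity({"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}))
```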

Unfortunately, this evaluation is not able to distinguish between cyclical and conditional structures. Another significant drawback of this approach is that programs represented by the same graphs can have predicates of completely different complexity (a predicate is a logical expression containing at least one variable).

To correct this shortcoming, G. Myers developed a new technique. As an estimate he suggested taking an interval (hence this estimate is also called an interval estimate), where h is zero for simple predicates and h = n − 1 for n-ary predicates. This method makes it possible to distinguish predicates of different complexity, but in practice it is almost never used.

Another modification of the McCabe method is the Hansen method. The measure of program complexity in this case is represented as a pair (cyclomatic complexity, number of statements). The advantage of this measure is its sensitivity to the structure of the software.

Chen's topological measure expresses the complexity of a program in terms of the number of boundary crossings between regions formed by the program graph. This approach is applicable only to structured programs that only allow sequential connection of control structures. For unstructured programs, the Chen measure depends significantly on conditional and unconditional branches. In this case, you can specify the upper and lower bounds of the measure. The top one is m+1, where m is the number of logical operators with their mutual nesting. The lower one is equal to 2. When the control graph of a program has only one connected component, the Chen measure coincides with the McCabe cyclomatic measure.

Continuing the topic of analyzing the control graph of a program, we can distinguish another subgroup of metrics - Harrison and Majel metrics.

These measures take into account the nesting level and the length of the program.

Each vertex is assigned its own complexity according to the operator it represents. This initial vertex complexity can be calculated in any way, including using Halstead measures. For each predicate vertex, we select a subgraph generated by the vertices that are the ends of the arcs emanating from it, as well as the vertices reachable from each such vertex (the lower boundary of the subgraph), and the vertices that lie on the paths from the predicate vertex to some lower boundary. This subgraph is called the sphere of influence of the predicate vertex.

The reduced complexity of a predicate vertex is the sum of the initial or reduced complexities of the vertices included in its sphere of influence, plus the primary complexity of the predicate vertex itself.

The functional measure (SCOPE) of a program is the sum of the reduced complexities of all vertices of the control graph.

Functional relation (SCORT) is the ratio of the number of vertices in a control graph to its functional complexity, and terminal vertices are excluded from the number of vertices.

SCORT can take different values ​​for graphs with the same cyclomatic number.

The Pivovarsky metric is another modification of the cyclomatic complexity measure. It allows you to track differences not only between sequential and nested control structures, but also between structured and unstructured programs. It is expressed by the relation N(G) = v*(G) + SUM Pi, where v*(G) is the modified cyclomatic complexity, calculated in the same way as V(G) but with one difference: a CASE statement with n outputs is considered as one logical operator rather than as n − 1 operators.

Pi is the nesting depth of the i-th predicate vertex. To calculate the depth of nesting of predicate vertices, the number of “spheres of influence” is used. The depth of nesting is understood as the number of all “spheres of influence” of predicates that are either completely contained in the sphere of the vertex in question or intersect with it. The depth of nesting increases due to the nesting not of the predicates themselves, but of “spheres of influence”. The Pivovarsky measure increases when moving from sequential programs to nested ones and then to unstructured ones, which is its huge advantage over many other measures of this group.

The Woodward measure is the number of crossings of arcs in the control graph. Since such situations should not arise in a well-structured program, this metric is used mainly for weakly structured languages (assembler, Fortran). A crossing point appears when control passes outside the bounds of two vertices that are sequential statements.

The boundary value method is also based on the analysis of the control graph of the program. To define this method, it is necessary to introduce several additional concepts.

Let G be the control graph of a program with a single initial and a single final vertex.

In this graph, the number of arcs entering a vertex is called the negative degree of the vertex, and the number of arcs emanating from the vertex is called the positive degree of the vertex. The set of graph vertices can then be divided into two groups: vertices whose positive degree is <= 1, and vertices whose positive degree is >= 2.

We will call the vertices of the first group receiving vertices, and the vertices of the second group - selection vertices.

Each receiving vertex has a reduced complexity of 1, except for the final vertex, whose reduced complexity is 0. The reduced complexities of all vertices of G are summed to form the absolute boundary complexity of the program. After this, the relative boundary complexity of the program is determined as S0 = 1 - (v - 1)/Sa,

Where S0 is the relative boundary complexity of the program, Sa is the absolute boundary complexity of the program, and v is the total number of vertices of the program graph.

There is a Schneidewind metric, expressed in terms of the number of possible paths in the control graph.

3. Data flow complexity metrics

The next class of metrics is the data flow complexity metrics.

Chapin metric: the essence of the method is to assess the information strength of a single software module by analyzing the nature of the use of variables from the input-output list.

The entire set of variables that make up the I/O list is divided into 4 functional groups:

1. P - input variables for calculations and for providing output,

2. M - variables modified or created within the program,

3. C - variables involved in controlling the operation of the software module (control variables),

4. T - variables not used in the program ("parasitic" variables).

Since each variable can perform several functions simultaneously, it must be considered in each corresponding functional group.

Chapin metric:

Q = a1*P + a2*M + a3*C + a4*T,

Where a1, a2, a3, a4 are weighting coefficients.

With the weights a1 = 1, a2 = 2, a3 = 3 and a4 = 0.5 this becomes Q = P + 2M + 3C + 0.5T.
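As a rough illustration, the calculation can be sketched in Python as follows; the group sizes are assumed to have been counted by hand, and the default weights are the ones shown above.

def chapin_q(p, m, c, t, weights=(1.0, 2.0, 3.0, 0.5)):
    # Q = a1*P + a2*M + a3*C + a4*T for the sizes of the four variable groups
    a1, a2, a3, a4 = weights
    return a1 * p + a2 * m + a3 * c + a4 * t

# Hypothetical module: 4 input, 2 modified, 1 control, 3 unused variables
print(chapin_q(p=4, m=2, c=1, t=3))  # 4 + 4 + 3 + 1.5 = 12.5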

The span metric is based on the localization of data accesses within each section of the program. The span is the number of statements containing a given identifier between its first and last appearance in the program text. Consequently, an identifier that appears n times has a span of n - 1. With a large span, testing and debugging become more complicated.
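A rough Python sketch of counting spans is given below; the tokenization is deliberately naive and the statement list is invented, the point being only that an identifier seen n times contributes n - 1 spans.

import re
from collections import Counter

def identifier_spans(statements):
    # Count, per identifier, in how many statements it appears; n occurrences give n - 1 spans
    counts = Counter()
    for stmt in statements:
        for name in set(re.findall(r"[A-Za-z_]\w*", stmt)):
            counts[name] += 1
    return {name: n - 1 for name, n in counts.items() if n > 1}

code = ["x = 0", "y = x + 1", "z = y * y", "print(x, z)"]
print(identifier_spans(code))  # e.g. {'x': 2, 'y': 1, 'z': 1}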

Another metric that takes into account the complexity of a data flow is a metric that relates the complexity of programs to accesses to global variables.

A module-global variable pair is denoted by (p,r), where p is the module that has access to the global variable r. Depending on the presence in the program of a real reference to the variable r, two types of “module - global variable” pairs are formed: actual and possible. The possible reference to r by p shows that the domain of existence of r includes p.

The number of actual pairs is denoted by Aup and indicates how many times the modules actually accessed global variables; the number Pup indicates how many times they could have accessed them.

The ratio of the number of actual accesses to the number of possible ones is Rup = Aup/Pup.

This formula shows the approximate probability of an arbitrary module referencing an arbitrary global variable. Obviously, the higher this probability, the higher the probability of “unauthorized” change of any variable, which can significantly complicate the work associated with modifying the program.
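A minimal sketch of this ratio, assuming the actual and possible module/variable pairs have already been collected (the pair sets below are invented):

def access_ratio(actual_pairs, possible_pairs):
    # Approximate probability that an arbitrary module touches an arbitrary global variable
    pup = len(set(possible_pairs))
    return len(set(actual_pairs)) / pup if pup else 0.0

possible = {("mod_a", "g_cfg"), ("mod_a", "g_log"), ("mod_b", "g_cfg"), ("mod_b", "g_log")}
actual = {("mod_a", "g_cfg"), ("mod_b", "g_log")}
print(access_ratio(actual, possible))  # 2 / 4 = 0.5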

The Kafura measure (the Henry-Kafura information flow measure) was created based on the concept of information flows. To use this measure, the concepts of local and global flow are introduced: a local flow of information from A to B exists if:

1. Module A calls module B (direct local flow);

2. Module B calls module A and A returns to B a value that is used in B (indirect local flow);

3. Module C calls modules A and B and passes the result of executing module A to B (indirect local flow).

Next, we should give the concept of global information flow: a global flow of information from A to B through a global data structure D exists if module A puts information into D and module B uses information from D.

Based on these concepts, the value I is introduced - the information complexity of the procedure:
I = length * (fan_in * fan_out)^2
Here:

Length - the complexity of the procedure text (measured by one of the volume metrics, such as the Halstead, McCabe or LOC metrics);

Fan_in - the number of local flows entering the procedure plus the number of data structures from which the procedure reads information;

Fan_out - the number of local flows leaving the procedure plus the number of data structures that are updated by the procedure.

The information complexity of a module can be defined as the sum of the information complexities of the procedures included in it.
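A small sketch of this calculation, assuming the length, fan_in and fan_out of each procedure have been measured separately (the numbers below are invented):

def procedure_complexity(length, fan_in, fan_out):
    # I = length * (fan_in * fan_out)^2
    return length * (fan_in * fan_out) ** 2

def module_information_complexity(procedures):
    # procedures: iterable of (length, fan_in, fan_out) triples
    return sum(procedure_complexity(l, fi, fo) for l, fi, fo in procedures)

procs = [(40, 2, 3), (25, 1, 2)]  # hypothetical module with two procedures
print(module_information_complexity(procs))  # 40*36 + 25*4 = 1540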

The next step is to consider the information complexity of a module relative to some data structure. The information measure of the complexity of a module with respect to a data structure (a short sketch follows the definitions below) is:

J = W * R + W * RW + RW *R + RW * (RW - 1)

W is the number of procedures that only update the data structure;

R - only read information from the data structure;

RW - both read and update information in the data structure.
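A minimal sketch of this measure; W, R and RW are supplied as plain counts for a hypothetical data structure:

def structure_complexity(w, r, rw):
    # J = W*R + W*RW + RW*R + RW*(RW - 1)
    return w * r + w * rw + rw * r + rw * (rw - 1)

# Hypothetical structure touched by 2 writers, 3 readers and 1 reader-writer
print(structure_complexity(w=2, r=3, rw=1))  # 6 + 2 + 3 + 0 = 11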

Another measure in this group is the Oviedo measure. Its essence is that the program is divided into linear non-intersecting sections - rays of operators that form the control graph of the program.

The author of the metric makes the following assumptions: the programmer can find the relation between the defining and using occurrences of a variable more easily within a ray than between rays, and the number of distinct defining occurrences in each ray is more important than the total number of using occurrences of variables in each ray.

Let us denote by R(i) the set of defining occurrences of variables that are located within the radius of action of ray i (a defining occurrence of a variable is within the radius of action of a ray if the variable is either local in it and has a defining occurrence, or for it there is a defining occurrence in some previous ray, and there is no local definition along the path). Let us denote by V(i) the set of variables whose occurrences are already in ray i. Then the measure of complexity of the i-th ray is given as:

DF(i) = SUM(DEF(v_j)), j = 1...|V(i)|,

Where DEF(v_j) is the number of defining occurrences of the variable v_j from the set R(i), and |V(i)| is the cardinality of the set V(i).
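A rough sketch for a single ray, assuming R(i) and V(i) have already been extracted from the control graph (the sets below are invented):

def ray_complexity(defs_in_scope, used_vars):
    # DF(i) = sum of DEF(v_j) over the variables of V(i)
    return sum(defs_in_scope.get(v, 0) for v in used_vars)

defs_in_scope = {"x": 2, "y": 1, "flag": 1}   # hypothetical R(i): variable -> DEF count
used_vars = {"x", "flag"}                     # hypothetical V(i)
print(ray_complexity(defs_in_scope, used_vars))  # 2 + 1 = 3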

4. Metrics of control flow and program data complexity

The fourth class of metrics comprises metrics that are close at once to the quantitative metrics, the program control flow complexity metrics, and the data flow complexity metrics (strictly speaking, this class and the class of control flow complexity metrics are the same class of topological metrics, but it makes sense to separate them in this context for clarity). This class of metrics establishes the complexity of the program structure both on the basis of quantitative counts and on the basis of analysis of the control structures.

The first of these metrics is the testing measure M - a complexity measure that satisfies the following condition:

The measure increases with the depth of nesting and takes into account the length of the program. Closely related to the testing measure is a measure based on regular embeddings. The idea of this measure of program complexity is to count the total number of characters (operands, operators, parentheses) in a regular expression, with the minimum required number of parentheses, describing the control graph of the program. All measures in this group are sensitive to the nesting of control structures and to the length of the program; their computational cost, however, is higher.

A measure of software quality is also the coupling of program modules. If modules are tightly coupled, the program becomes difficult to modify and difficult to understand. This measure is not expressed numerically. Types of module coupling:

Data coupling - modules interact by passing parameters, and each parameter is an elementary information item. This is the most preferable type of coupling.

Coupling by data structure - one module passes another a composite information object (a structure) for data exchange.

Control coupling - one module passes another an information object, a flag, intended to control its internal logic.

Modules are coupled by a common area if they refer to the same area of global data. Common-area coupling is undesirable because, first, an error in a module that uses the global area can unexpectedly show up in any other module, and second, such programs are difficult to understand, since it is hard for the programmer to determine exactly what data a particular module uses.

Content coupling - one module refers directly to the inside of another. This is an unacceptable type of coupling, since it completely contradicts the principle of modularity, i.e. treating a module as a black box.

External coupling - two modules use external data, for example a communication protocol.

Message coupling - the loosest type of coupling: the modules are not directly connected to each other, they communicate through messages that have no parameters.

Absence of coupling - the modules do not interact with each other.

Subclass coupling - the relation between a parent class and a child class, in which the child depends on the parent but the parent does not depend on the child.

Temporal coupling - two actions are grouped in one module only because, owing to circumstances, they occur at the same time.

Another measure related to module stability is the Colofello measure. It can be defined as the number of changes that must be made in modules other than the module whose stability is being checked, where those changes concern the module under examination.

The next metric from this class is the McClure Metric. There are three stages in calculating this metric:

1. For each control variable i, the value of its complexity function C(i) is calculated using the formula: C(i) = (D(i) * J(i))/n.

Where D(i) is a value that measures the scope of the variable i, J(i) is a measure of the complexity of the interaction of modules through the variable i, and n is the number of individual modules in the partitioning scheme.

2. For all modules included in the partitioning sphere, the value of their complexity functions M(P) is determined by the formula M(P) = fp * X(P) + gp * Y(P)
where fp and gp are, respectively, the number of modules immediately preceding and immediately following module P, X(P) is the complexity of accessing module P,

Y(P) is the complexity of control calls from module P to other modules.

3. The overall complexity of the MP hierarchical scheme for dividing a program into modules is given by the formula:

MP = SUM(M(P)) over all modules P of the program.

This metric is focused on programs that are well structured, composed of hierarchical modules that define the functional specification and the control structure. It is also assumed that each module has one entry point and one exit point, that the module performs exactly one function, and that the modules are called in accordance with a hierarchical control system that specifies the call relation on the set of program modules.
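The three steps can be sketched as follows; D(i), J(i), X(P), Y(P) and the module counts are assumed to have been measured elsewhere, and all numbers are placeholders:

def variable_complexity(d_i, j_i, n_modules):
    # Step 1: C(i) = D(i) * J(i) / n
    return d_i * j_i / n_modules

def mcclure_module_complexity(fp, gp, x_p, y_p):
    # Step 2: M(P) = fp * X(P) + gp * Y(P)
    return fp * x_p + gp * y_p

def scheme_complexity(modules):
    # Step 3: MP = sum of M(P) over all modules; each entry is (fp, gp, X(P), Y(P))
    return sum(mcclure_module_complexity(*m) for m in modules)

print(variable_complexity(3, 2, 4))  # C(i) = 3 * 2 / 4 = 1.5
modules = [(0, 2, 1.0, 2.0), (1, 1, 1.5, 1.0), (2, 0, 2.0, 0.0)]
print(scheme_complexity(modules))    # 4.0 + 2.5 + 4.0 = 10.5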

There is also a metric based on an information-theoretic concept - the Berlinger measure. The complexity measure is calculated as M = SUM(f_i * log2(p_i)), where f_i is the frequency of occurrence of the i-th symbol and p_i is the probability of its occurrence.

The disadvantage of this metric is that a program containing many unique characters, but in small quantities, will have the same complexity as a program containing a small number of unique characters, but in large quantities.
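A minimal sketch of this measure, reading f_i as the raw occurrence count of a symbol and p_i as its relative frequency (one possible interpretation; the token list is invented):

import math
from collections import Counter

def berlinger_measure(symbols):
    # M = SUM(f_i * log2(p_i)) over the distinct symbols of the text
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum(f * math.log2(f / total) for f in counts.values())

tokens = "a = a + b ; b = a * 2".split()
print(round(berlinger_measure(tokens), 2))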

5. Object-oriented metrics

With the development of object-oriented programming languages, a new class of metrics has emerged, also called object-oriented metrics. In this group, the most commonly used are the Martin metrics set and the Chidamber and Kemerer metrics set. First, let's look at the first subgroup.

Before starting to consider Martin metrics, it is necessary to introduce the concept of class categories. In reality, a class can rarely be reused in isolation from other classes. Almost every class has a group of classes with which it works in cooperation, and from which it cannot be easily separated. To reuse such classes, you must reuse the entire group of classes. Such a group of classes is strongly connected and is called a category of classes. For the existence of a class category, the following conditions exist:

Classes within a class category are closed together against any attempt at change. This means that if one class has to change, all classes in the category are likely to change; if any one of the classes is open to some kind of change, they are all open to that kind of change.

Classes in a category are reused only together. They are so interdependent that they cannot be separated from each other. Thus, if any attempt is made to reuse one class of the category, all the other classes must be reused with it.

The responsibility, independence, and stability of a category can be measured by counting the dependencies that interact with that category. Three metrics can be defined:

1. Ca: centripetal (afferent) coupling - the number of classes outside this category that depend on classes inside this category.

2. Ce: centrifugal (efferent) coupling - the number of classes inside this category that depend on classes outside this category.

3. I: instability: I = Ce / (Ca + Ce). This metric takes values in the range [0, 1].

I = 0 indicates the most stable category.

I = 1 indicates the most unstable category.

One can define a metric that measures the abstractness of a category (if a category is abstract, it is flexible enough and can easily be extended) as follows:

A: Abstractness: A = nA / nAll,

Where nA is the number of abstract classes in the category,

nAll is the total number of classes in the category.

The values of this metric lie in the range [0, 1]: A = 0 denotes a completely concrete category and A = 1 a completely abstract one.

Now, based on the above Martin metrics, you can build a graph that shows the relationship between abstractness and instability. If we construct a straight line on it, given by the formula I+A=1, then this straight line will contain categories that have the best balance between abstraction and instability. This line is called the main sequence.

Distance to main sequence: D=|(A+I-1)/sqrt(2)|

Normalized distance to the main sequence: Dn = |A + I - 1|.

It is true of almost any category that the closer it is to the main sequence, the better.
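A sketch of these category-level metrics; the dependency map, category membership and set of abstract classes below are invented, and real tools would extract them from the code:

import math

def martin_metrics(deps, category, abstract):
    # Ca: outside classes depending on the category; Ce: category classes depending outward
    ca = sum(1 for cls, targets in deps.items() if cls not in category and targets & category)
    ce = sum(1 for cls in category if deps.get(cls, set()) - category)
    instability = ce / (ca + ce) if (ca + ce) else 0.0
    abstractness = len(abstract & category) / len(category)
    distance = abs(abstractness + instability - 1) / math.sqrt(2)
    return ca, ce, instability, abstractness, distance

deps = {"Report": {"Formatter"}, "Formatter": {"Layout"}, "Layout": set()}
category = {"Formatter", "Layout"}
abstract = {"Layout"}
print(martin_metrics(deps, category, abstract))  # (1, 0, 0.0, 0.5, ~0.354)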

The next subgroup of metrics is the Chidamber and Kemerer metrics. These metrics are based on analysis of class methods, inheritance tree, etc.

WMC (Weighted Methods per Class) - the total complexity of all methods of the class: WMC = SUM(c_i), i = 1...n, where c_i is the complexity of the i-th method, calculated using any metric (Halstead, etc., depending on the criterion of interest); if all methods have the same complexity, then WMC = n.

DIT (Depth of Inheritance Tree) - the depth of the inheritance tree (the longest path through the class hierarchy from the ancestor class to the given class). The greater the depth, the better, since greater depth increases data abstraction and lowers the saturation of the class with methods; however, at sufficiently great depth the complexity of understanding and writing the program grows sharply.

NOC (Number of children) - the number of descendants (immediate), the more, the higher the data abstraction.

CBO (Coupling Between Object classes) - coupling between classes; it shows the number of classes with which the given class is associated. Everything stated earlier about module coupling holds for this metric; in particular, with a high CBO data abstraction decreases and reuse of the class becomes more difficult.

RFC (Response For a Class) - RFC = |RS|, where RS is the response set of the class, that is, the set of methods that can potentially be called by a method of the class in response to data received by an object of the class. That is, RS = {M} ∪ {R_i}, i = 1...n, where M is the set of all methods of the class and R_i is the set of all methods that can be called by the i-th method; RFC is then the cardinality of this set. The larger RFC is, the more complicated testing and debugging become.

LCOM (Lack of Cohesion in Methods) - lack of cohesion of methods. To determine this parameter, consider a class C with n methods M1, M2, ..., Mn; let I1, I2, ..., In be the sets of variables used in these methods. Now define P as the set of pairs of methods that have no common variables and Q as the set of pairs of methods that do have common variables. Then LCOM = |P| - |Q|. Lack of cohesion may be a signal that the class could be split into several other classes or subclasses, so it is better to increase cohesion in order to increase data encapsulation and reduce the complexity of classes and methods.
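A minimal sketch of LCOM as defined here (|P| - |Q|, allowed to go negative); the per-method sets of instance variables are invented:

from itertools import combinations

def lcom(method_vars):
    # method_vars: dict mapping method name -> set of instance variables it uses
    p = q = 0
    for vars1, vars2 in combinations(method_vars.values(), 2):
        if vars1 & vars2:
            q += 1   # the pair shares at least one variable
        else:
            p += 1   # the pair has no variables in common
    return p - q

methods = {
    "open":  {"path", "handle"},
    "read":  {"handle", "buffer"},
    "close": {"handle"},
    "name":  {"path"},
}
print(lcom(methods))  # 2 disjoint pairs - 4 sharing pairs = -2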

6. Reliability metrics

The next type of metrics are metrics close to the quantitative ones but based on the number of errors and defects in the program. There is no point in considering the features of each of these metrics in detail; it is enough simply to list them: the number of structural changes made since the last check, the number of errors found during code review, the number of errors found during program testing, and the number of structural changes required for correct program operation. For large projects these indicators are usually considered per thousand lines of code, i.e. as the average number of defects per thousand lines of code.

7. Hybrid metrics

Finally, it is necessary to mention another class of metrics called hybrid ones. Metrics of this class are based on simpler metrics and represent their weighted sum. The first representative of this class is the Kokol metric. It is defined as follows:

H_M = (M + R1 * M(M1) + ... + Rn * M(Mn)) / (1 + R1 + ... + Rn),

Where M is the basic metric, Mi are other interesting measures, Ri are correctly selected coefficients, M(Mi) are functions.

The functions M(Mi) and coefficients Ri are calculated using regression analysis or problem analysis for a specific program.
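A small sketch of such a hybrid combination; the base metric, the additional measures and the coefficients are all placeholders, since in practice they would come from regression or task analysis:

def hybrid_metric(base, weighted_terms):
    # H_M = (M + R1*M(M1) + ... + Rn*M(Mn)) / (1 + R1 + ... + Rn)
    numerator = base + sum(r * m for r, m in weighted_terms)
    denominator = 1 + sum(r for r, _ in weighted_terms)
    return numerator / denominator

# Hypothetical: base = cyclomatic complexity, extra terms = Halstead volume and LOC
print(hybrid_metric(base=7, weighted_terms=[(0.4, 120.0), (0.1, 250.0)]))  # 80 / 1.5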

The Zolnovsky, Simmons, Thayer metric is also a weighted sum of various indicators. There are two options for this metric:

a weighted sum of the indicators (structure, interaction, volume, data) with weights (a, b, c, d);

a weighted sum of the indicators (interface complexity, computational complexity, I/O complexity, readability) with weights (x, y, z, p).

The metrics used in each option are chosen depending on the specific task, and the coefficients depending on the significance of a given metric for the decision being made in that case.

Conclusion

To summarize, I would like to note that there is no single universal metric. Any metric characteristics of a program that are monitored must be tracked either in relation to one another or in relation to the specific task; in addition, hybrid measures can be used, but they also depend on simpler metrics and likewise cannot be universal. Strictly speaking, any metric is just an indicator that depends heavily on the language and programming style, so no measure should be elevated to an absolute, and no decisions should be made on its basis alone.