Data collection program. A program for collecting clinical and statistical data about patients: "Medstatistics"

For a long time, the free AIDA32 utility was the best program for collecting system information and had no worthy analogues. It provided complete information on almost every hardware and software component. It also allowed you to check the network environment and run memory performance tests.

However, in March 2004 the developer announced that development of AIDA32 would be frozen and that the main development would move to another company. There, development continued, but as a commercial product called Everest. When Everest was acquired by FinalWire in 2010, development of Everest was discontinued. The product itself lived on under the name AIDA64, which still exists today. Unfortunately, this product is available only in trial versions.

Review of free programs for collecting computer information

AIDA32 aka Everest Home for collecting information about your computer

However, you may still be able to find the old version, and there is still a free version of Everest. The older AIDA32 is better at collecting data about the network environment, while Everest covers more modern hardware. So even though they are essentially the same product, you can use both at once to get the best results.

The Belarc Advisor program is an analogue of AIDA32 for collecting information about the system

If you need to take an inventory of the hardware of a single computer, Belarc Advisor will come in handy. The program is free for non-commercial use. Of course, it is inferior to AIDA32 in coverage, but it has one important advantage: it is actively developed. So in time it may well overtake AIDA32.

HWiNFO program for convenient system inventory

SIW (System Information for Windows)

Pros: detailed results; portable.
Cons: does not support Windows 8 and higher; the free version is no longer updated.

PC Wizard

Pros: quite detailed information; a decent benchmark; updated regularly.
Cons: the installer contains "Ask Toolbar" (you don't have to install it).

Belarc Advisor

Pros: actively developed.
Cons: not as powerful as AIDA32.

We reviewed the basic concepts and terms within the Data Mining technology. Today we’ll take a closer look at Web Mining and approaches to extracting data from web resources.

Web Mining is the process of extracting data from web resources; as a rule, it is more practical than theoretical. The main goal of Web Mining is to collect data (parsing) and then save it in the required format. In practice, the task comes down to writing HTML parsers, and we’ll talk about this in more detail.

There are several approaches to data extraction:

  1. DOM tree analysis, using XPath.
  2. Parsing strings.
  3. Using regular expressions.
  4. XML parsing.
  5. Visual approach.

Let's consider all approaches in more detail.

DOM tree analysis

This approach is based on DOM tree analysis. With it, data can be obtained directly by the identifier, name or other attributes of a tree element (such an element can be a paragraph, table, block, etc.). If an element is not designated by any identifier, it can be reached along some unique path going down the DOM tree, or by going through a collection of similar elements.
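
As a rough illustration of both ideas (addressing one element by an attribute or a path, and walking a collection of similar elements), here is a minimal Python sketch. It assumes a small, well-formed fragment, so that the standard library's xml.etree.ElementTree can stand in for a real HTML DOM engine; the markup, the id value and the variable names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment (invented for this sketch).
PAGE = """
<html>
  <body>
    <div id="specs">
      <table>
        <tr><td>Company</td><td>Microsoft</td></tr>
        <tr><td>Headquarters</td><td>Redmond</td></tr>
      </table>
    </div>
    <p>Unrelated paragraph</p>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# 1. Reach an element directly by an attribute (here, its id).
specs = root.find(".//div[@id='specs']")

# 2. Reach an element along a path down the tree:
#    body -> div -> table -> first row -> second cell.
first_value = root.find("./body/div/table/tr[1]/td[2]").text

# 3. Go through a collection of similar elements (all table rows).
rows = {}
for tr in specs.findall("./table/tr"):
    name, value = (td.text for td in tr.findall("td"))
    rows[name] = value

print(first_value)
print(rows)
```

Real pages are rarely this tidy, so in practice the same queries would normally be run through an HTML-aware DOM library rather than a strict XML parser.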

Advantages of this approach:

  • you can obtain data of any type and any level of complexity;
  • knowing the location of an element, you can get its value by specifying the path to it.

Disadvantages of this approach:

  • different HTML/JavaScript engines generate the DOM tree differently, so the parser ends up tied to a specific engine;
  • the element's path may change, so such parsers are, as a rule, designed for a short period of data collection;
  • the DOM path can be complex and is not always unambiguous.
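
This approach can be used in conjunction with the Microsoft.mshtml library, which is essentially the core element of Internet Explorer. The example below uses the HtmlAgilityPack library:

// FixLink is a user-supplied helper, assumed to be defined elsewhere.
using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
// SelectNodes returns null when no node matches the XPath expression.
var links = doc.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        HtmlAttribute att = link.Attributes["href"];
        att.Value = FixLink(att);
    }
}
doc.Save("file.htm");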

Parsing strings

Even though this approach cannot be used for writing serious parsers, I will talk a little about it.

Sometimes data is displayed using some kind of template (for example, a table of characteristics of a mobile phone), where the parameter names are standard and only their values change. In this case, the data can be obtained not by analyzing the DOM tree but by parsing strings, for example, as is done in the Data Extracting SDK:

Company: Microsoft
Headquarters: Redmond

Code:

// NOTE: the HTML tags of this sample were lost in extraction;
// the table markup below is an assumed reconstruction.
string data = "<table><tr><td>Company: Microsoft</td></tr><tr><td>Headquarters: Redmond</td></tr></table>";
string company = data.GetHtmlString("Company: ", "</td>");
string location = data.GetHtmlString("Headquarters: ", "</td>");

// output
// company = "Microsoft"
// location = "Redmond"

Using a set of methods to parse strings is sometimes (usually in simple template cases) more effective than parsing a DOM tree or XPath.
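
The idea behind this kind of string extraction can be sketched in a few lines of Python. This is only an illustration of the technique, not the Data Extracting SDK's implementation: the helper name get_between, the sample string and its <br/> separators are all assumptions made for the example.

```python
def get_between(text, start, end):
    """Return the text between the first `start` marker and the next `end` marker.

    Returns None when either marker is missing.
    """
    begin = text.find(start)
    if begin == -1:
        return None
    begin += len(start)
    finish = text.find(end, begin)
    if finish == -1:
        return None
    return text[begin:finish].strip()


# Hypothetical template output; the <br/> separators are assumed.
data = "Company: Microsoft<br/>Headquarters: Redmond<br/>"

company = get_between(data, "Company: ", "<br/>")
location = get_between(data, "Headquarters: ", "<br/>")
print(company, location)
```

Because the helper only searches for two literal markers, it is fast and simple, but it breaks as soon as the template wording changes, which is why it suits only simple, stable templates.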

Regular Expressions and XML Parsing

I have often seen HTML parsed entirely with regular expressions. This is a fundamentally wrong approach, since it creates more problems than it solves.

Regular expressions should only be used to extract data that has a strict format - email addresses, phone numbers, etc., in rare cases - addresses, template data.
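
For strictly formatted values, a regular expression can be applied to text that has already been pulled out of the page. The sketch below uses Python's re module; both patterns are deliberately simplified assumptions (they are not complete validators), and the sample text is invented.

```python
import re

# Deliberately simple patterns: good enough for a sketch, not full validators.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

text = "Contact: sales@example.com or support@example.org, tel. +1 (425) 555-0100."

emails = EMAIL_RE.findall(text)
phones = PHONE_RE.findall(text)
print(emails)
print(phones)
```

Note that the patterns are run over plain text, not used to walk the HTML structure itself.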

Another inefficient approach is to treat HTML as XML data. The reason is that HTML is rarely valid, i.e. rarely well-formed enough to be treated as XML. Libraries that implemented this approach spent more time converting HTML to XML than on parsing the data itself. Therefore, it is better to avoid this approach.

Visual approach

At the moment, the visual approach is at an early stage of development. The essence of the approach is that the user can “configure” the system to obtain the necessary data of any complexity and nesting without using a programming language or API. I have already written about something similar (though applicable in a different area): methods for analyzing web pages at the level of information blocks. I think that the parsers of the future will be visual.

Problems when parsing HTML data

  • the use of JavaScript / AJAX / asynchronous loading makes it very difficult to write parsers;
  • different HTML rendering engines may produce different DOM trees (in addition, the engines may have bugs that then affect the results of the parsers);
  • large volumes of data require writing distributed parsers, which entails additional synchronization costs.

It is impossible to single out one approach that is 100% applicable in all cases, so modern libraries for parsing HTML data, as a rule, combine different approaches. For example, HtmlAgilityPack lets you analyze the DOM tree (using XPath) and has recently added support for LINQ to XML. Data Extracting SDK uses DOM tree analysis, contains a set of additional methods for parsing strings, and also lets you use LINQ to query the page's DOM model.

Today, the absolute leader for parsing HTML data among .NET developers is the HtmlAgilityPack library, but just for fun, you can look at other libraries.

As a rule, mobile data collection terminals (TSD) are sold without any application software that would let them recognize product barcodes, accumulate them, compare them with the invoice and upload them to a PC. To put the terminal to practical use, Cleverence Soft offers a special version of the Mobile SMARTS client for the TSD and a simple PC program for exchanging data with the TSD. The program converts regular Excel or CSV files into a format that the terminal program can understand, and vice versa.
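
The "compare with the invoice" step mentioned above can be pictured with a short Python sketch. It is a generic illustration and says nothing about how Mobile SMARTS works internally; the function name, the barcodes and the quantities are all hypothetical.

```python
from collections import Counter


def compare_with_invoice(scanned_barcodes, invoice):
    """Compare scanned barcodes with invoice quantities.

    Returns {barcode: scanned - expected} for every barcode whose counts differ.
    """
    scanned = Counter(scanned_barcodes)
    differences = {}
    for barcode in set(scanned) | set(invoice):
        delta = scanned.get(barcode, 0) - invoice.get(barcode, 0)
        if delta != 0:
            differences[barcode] = delta
    return differences


# Hypothetical data: expected quantities per barcode, and the scans from the terminal.
invoice = {"4600000000011": 2, "4600000000028": 1}
scans = ["4600000000011", "4600000000035", "4600000000011"]

result = compare_with_invoice(scans, invoice)
print(result)
```

A negative value means an item from the invoice was not scanned (a shortage); a positive value means something was scanned that the invoice does not account for.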

The universal program is intended primarily for non-1C accounting systems. For 1C:Enterprise, Cleverence Soft offers separate sets of programs called data collection terminal drivers.

TSD software allows you to create documents, scan barcodes, view lists of values, and enter a variety of different data. The PC program supplied with the application software allows you to convert data from the TSD into an Excel file of the required format in one click. Data on inventory, internal control, accounting, etc. can be easily collected, converted into Excel and sent to the manager by email. Real-world applications include:

  • conducting a quick inventory of stock balances;
  • collection of purchase orders in stores and points of sale;
  • control of goods delivery;
  • vehicle control: issuing work orders, access control, execution control;
  • access control at gates and at checkpoints;
  • collection of shipment data;
  • barcoded accounting in a small warehouse;
  • barcoded accounting at an addressed storage warehouse;
  • collecting orders at a simple or addressed storage warehouse;
  • property inventory;
  • library control;
  • and much more.

Thanks to the use of the Mobile SMARTS platform, the program includes a development tool that allows you to change the logic of document processing and the user interface of the TSD.

The terminal program ships with the following configurations:

  • Barcode collection: allows you to simply scan items individually or by entering quantities;
  • Simple warehouse: acceptance, shipment, return and inventory without taking into account cells or storage locations and without the ability to use nested containers marked with a barcode (pallets, trays, boxes with a unique number);
  • Addressed storage warehouse: acceptance, shipment, return and inventory taking into account cells or storage locations, but without the ability to use nested containers marked with a barcode (pallets, trays, boxes with a unique number). For each product, the terminal requests a storage location;
  • Container warehouse for address storage: acceptance, shipment, return and inventory taking into account cells or storage locations and the ability to use nested containers marked with a barcode (pallets, trays, boxes with a unique number). For each product, the terminal requests the storage location and container number, and allows you to view the layout of the containers.

The demo version is fully functional and allows you to use directories and documents of any size, except that only the first three lines of a document are copied during data exchange.

A lot of people engaged in various activities on the Internet are daily faced with the need to collect and analyze data from various Internet resources. Sources of collection may be stores, bulletin boards, exchanges, websites, groups on social networks, blogs, news feeds, search engines, catalogs, etc.
Every day, millions of gigabytes of various information are collected and processed. Tens of thousands of people work on this, spending millions of dollars and thousands of hours collecting and processing data. There are thousands of different tools for collecting and analyzing information from the web, databases and files.

Using automation for data collection and analysis will save you time and money.

One of the means of automating the collection (parsing) and analysis of information from the network is the Human Emulator program.
Unlike other programs for collecting (parsing) data, Human Emulator does not limit you in any way. In addition to the ability to create new solutions based on the functionality built into the program, you can use ready-made solutions written in PHP or C#. The wide functionality of the program plus the ability to use solutions written in PHP or C# allow you to solve problems of any complexity and create not just parsers (collectors) or analyzer processors, but entire full-cycle systems that will produce the final result: publication of collected and processed materials in stores or on websites, in social networking groups, on bulletin boards, in catalogs, etc.

Human Emulator works with databases, with files of various formats (csv, xml, txt, etc.), with sites made both on the basis of popular cms, such as joomla, wordpress, and with simple sites written in php or html. If necessary, you can perform auto-registration at the collection source, use proxies or socks.

Here are examples of ready-made collection (parsing) solutions that you can find on our website.

The data collection stage is one of the key ones when it comes to conducting a clinical trial. Correctly collected, properly formatted information about patients can significantly facilitate subsequent statistical processing. In an ideal database, each accounting item is represented as a variable that has the correct format, which makes it easy to transfer the data to special statistical programs such as IBM SPSS or STATISTICA.

Numerous requirements for the organization of a database, compliance with which is necessary to ensure the fundamental possibility of its subsequent statistical processing, are outlined by us in the form of the following recommendations.

Some difficulties in creating a database arise when input is carried out from several workstations. For example, a researcher asks his colleagues - doctors or nurses - to help with data entry. In this case, information entered from different places should form a single database, ultimately forming a common table. In turn, the researcher has the opportunity to download the current database at any time.

To make the applicant’s life, already full of worries, a little easier, a professional team of programmers has developed a special program called “Medstatistics” to the technical specifications of the editors of the Internet portal site. It allows clinical data to be collected in accordance with the research protocol.

The data collection program "Medstatistics" provides solutions to the following tasks:


  • Creating and editing a database from any device (computer, tablet, smartphone) connected to the Internet
  • Ensuring database compatibility with the most common statistical programs, IBM SPSS and STATISTICA
  • Fast data entry thanks to drop-down lists and checkboxes
  • Exporting the database at any time in .xls format (for working in Microsoft Excel)
  • Support for parallel input mode, in which data is entered into the database simultaneously from several workstations
  • Automated generation of primary documentation from the database: individual registration cards containing all information about a particular patient previously entered into the database, in .doc or .xls format, for printing and submission to the dissertation council

An important feature of the "Medstatistics" program is the protection of personal information, which is ensured by setting access rights to the database.

The program has a simple, intuitive interface with a minimum of user settings.

How to purchase and install the "Medstatistics" program for collecting clinical data:

  • The "Medstatistics" program is a cloud service owned by the editors of the Internet portal site and is provided to users on an indefinite lease. Termination of access to the program and physical removal of the database from the server is possible only after completion of the study and with the written consent of the user.

  • The cost of perpetual rental of the "Medstatistics" program is 15,000 rubles for the implementation of one data collection form with up to 30 indicators. Each indicator in excess of that number costs 200 rubles. Additional data collection forms are implemented at a 50% discount.

  • We work without prepayment, so payment is made only after the program has been successfully launched on the customer’s devices.

  • The rental price includes lifetime access to the database for the researcher and persons designated by the researcher, with the ability to add to and edit it; technical support for the "Medstatistics" program during the entire period of operation; customization of data entry forms; and training in working with the program.

  • When installing and configuring the program, we are ready to advise the customer free of charge on issues related to the organization of the study and subsequent statistical processing of data.

  • If a database is created in the "Medstatistics" program, the researcher receives a 10% discount on statistical analysis.
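
To make the pricing rule for a single form concrete, here is a small Python sketch of the arithmetic. It covers only the first data collection form; the 50% discount on additional forms and the 10% discount on statistical analysis are left out, because the text does not say exactly what amounts they apply to. The function and constant names are invented for the example.

```python
BASE_PRICE = 15_000          # rubles, one form with up to 30 indicators
INCLUDED_INDICATORS = 30
EXTRA_INDICATOR_PRICE = 200  # rubles per indicator above the included 30


def form_cost(indicators):
    """Cost in rubles of one data collection form with the given number of indicators."""
    if indicators < 1:
        raise ValueError("a form needs at least one indicator")
    extra = max(0, indicators - INCLUDED_INDICATORS)
    return BASE_PRICE + extra * EXTRA_INDICATOR_PRICE


print(form_cost(30))  # 15000
print(form_cost(42))  # 15000 + 12 * 200 = 17400
```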

If the conditions we offer suit you and you want to discuss the program for collecting data "Medstatistics" in more detail, please call us at: