AI Coding Nyt


  • Common data model

    A common data model (CDM) can refer to any standardised data model which allows for data and information exchange between different applications and data sources. Common data models aim to standardise logical infrastructure so that related applications can "operate on and share the same data", and can be seen as a way to "organize data from many sources that are in different formats into a standard structure". A common data model has been described as one of the components of a "strong information system". A standardised common data model has also been described as a typical component of a well-designed agile application, alongside a common communication protocol. Providing a single common data model within an organisation is one of the typical tasks of a data warehouse.

    == Examples of common data models ==

    === Border crossings ===

    X-trans.eu was a cross-border pilot project between the Free State of Bavaria (Germany) and Upper Austria with the aim of developing a faster procedure for the application and approval of cross-border large-capacity transports. The portal was based on a common data model that contained all the information required for approval.

    === Climate data ===

    The Climate Data Store Common Data Model is a common data model set up by the Copernicus Climate Change Service for harmonising essential climate variables from different sources and data providers.

    === General information technology ===

    Within service-oriented architecture, S-RAMP is a specification released by HP, IBM, Software AG, TIBCO, and Red Hat which defines a common data model for SOA repositories as well as an interaction protocol to facilitate the use of common tooling and sharing of data.

    Content Management Interoperability Services (CMIS) is an open standard for inter-operation of different content management systems over the internet, and provides a common data model for typed files and folders used with version control.

    The NetCDF software libraries for array-oriented scientific data implement a common data model called the NetCDF Java common data model, which consists of three layers built on top of each other to add successively richer semantics.

    === Health ===

    Within genomic and medical data, the Observational Medical Outcomes Partnership (OMOP) research program established under the U.S. National Institutes of Health has created a common data model for claims and electronic health records which can accommodate data from different sources around the world.

    PCORnet, which was developed by the Patient-Centered Outcomes Research Institute, is another common data model for health data including electronic health records and patient claims.

    The Sentinel Common Data Model was initially started as Mini-Sentinel in 2008. It is used by the Sentinel Initiative of the U.S. Food and Drug Administration.

    The Generalized Data Model was first published in 2019. It was designed to be a stand-alone data model as well as to allow for further transformation into other data models (e.g., OMOP, PCORnet, Sentinel). It has a hierarchical structure to flexibly capture relationships among data elements.

    The JANUS clinical trial data repository also provides a common data model which is based on the SDTM standard to represent clinical data submitted to regulatory agencies, such as tabulation datasets, patient profiles, listings, etc.

    === Logistics ===

    SX000i is a specification developed jointly by the Aerospace and Defence Industries Association of Europe (ASD) and the American Aerospace Industries Association (AIA) to provide information, guidance and instructions to ensure compatibility and commonality. The associated SX002D specification contains a common data model.

    === Microsoft Common Data Model ===

    The Microsoft Common Data Model is a collection of many standardised extensible data schemas with entities, attributes, semantic metadata, and relationships, which represent commonly used concepts and activities in various business areas. It is maintained by Microsoft and its partners, and is published on GitHub. Microsoft's Common Data Model is used in, among others, Microsoft Dataverse and various Microsoft Power Platform and Microsoft Dynamics 365 services.

    === Rail transport ===

    RailTopoModel is a common data model for the railway sector.

    === Other ===

    There are many more examples of various common data models for different uses published by different sources.
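
    To make the idea of organizing data "from many sources that are in different formats into a standard structure" more concrete, here is a minimal Python sketch. The `Patient` record, the two source layouts and every field name are hypothetical, invented for illustration; none of them are taken from OMOP, PCORnet, Sentinel or any other model named above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Patient:
    """Hypothetical common record shared by every downstream application."""
    patient_id: str
    birth_date: date
    sex: str  # normalised to "F", "M" or "U" (unknown)


def from_claims(row: dict) -> Patient:
    """Map a row from an assumed claims feed (ISO dates, numeric sex codes)."""
    sex_codes = {"1": "M", "2": "F"}
    return Patient(
        patient_id=f"claims:{row['member_no']}",
        birth_date=date.fromisoformat(row["dob"]),
        sex=sex_codes.get(row["sex_cd"], "U"),
    )


def from_ehr(row: dict) -> Patient:
    """Map a row from an assumed EHR export (day/month/year, spelled-out sex)."""
    day, month, year = (int(part) for part in row["BirthDate"].split("/"))
    return Patient(
        patient_id=f"ehr:{row['MRN']}",
        birth_date=date(year, month, day),
        sex={"female": "F", "male": "M"}.get(row["Gender"].lower(), "U"),
    )


patients = [
    from_claims({"member_no": "A17", "dob": "1980-02-29", "sex_cd": "2"}),
    from_ehr({"MRN": "991", "BirthDate": "29/02/1980", "Gender": "Female"}),
]

# Once both sources share one structure, the same query works on either.
born_1980 = [p.patient_id for p in patients if p.birth_date.year == 1980]
```

    Keeping one small mapping function per source means a new data provider only needs its own adapter; code that consumes `Patient` records does not change.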

  • Clone tool

    The clone tool, as it is known in Adobe Photoshop, Inkscape, GIMP, and Corel PhotoPaint, is used in digital image editing to replace information for one part of a picture with information from another part. In other image editing software, its equivalent is sometimes called a rubber stamp tool or a clone brush.

    == Applications ==

    The clone tool can remove objects by copying a nearby background. The user selects a matching location as the source, then paints over the element to be hidden.

    A typical use for the tool is in object removal, more colloquially "airbrushing" or "photoshopping" out an unwanted part of the image. If a part of an image is removed simply by cutting it out, then a hole is left in the background. The clone tool can fill in this hole convincingly with a copy of the existing background from elsewhere in the image.

    A common use for this tool is to retouch skin, particularly in portraits, to remove blemishes and make skin tones more even. Cloning can also be used to remove other unwanted elements, such as telephone wires, an unwanted bird in the sky, and the like.

    A more automated method of object removal uses texture synthesis to fill in gaps. Of these, patch-based texture synthesis or "image quilting" is essentially an automated application of the clone tool, choosing the optimal source area so as to patch over with a minimal seam.

    In some cases, the undesired object is mixed with the remainder of the image, and a simple circular brush, even with feathering, would not work. For these cases, some programs allow an object to be selected by color/outline so other areas are not affected. Other programs allow edge/color sensitive brushes to deal with such objects.

    == Healing tool ==

    A similar tool is the healing tool, which occurs in variants such as the healing brush or spot healing tool. These incorporate the existing texture, rather than painting it over.
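
    The basic operation described above (pick a source location, then paint its pixels over the element to hide) can be sketched in a few lines of Python. This is an illustrative hard-edged circular brush on a plain list-of-lists image, not the implementation used by any of the editors named; the function name `clone_stamp` and its parameters are assumptions, and feathering is left out.

```python
def clone_stamp(image, source, target, radius):
    """Copy a circular patch centred on `source` over the area centred on `target`.

    `image` is a list of rows of pixel values; `source` and `target` are
    (row, column) pairs. Returns a new image and leaves the input untouched.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    (src_r, src_c), (dst_r, dst_c) = source, target
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr * dr + dc * dc > radius * radius:
                continue  # outside the circular brush
            sr, sc = src_r + dr, src_c + dc
            tr, tc = dst_r + dr, dst_c + dc
            inside_source = 0 <= sr < height and 0 <= sc < width
            inside_target = 0 <= tr < height and 0 <= tc < width
            if inside_source and inside_target:
                # Read from the original so overlapping regions do not smear.
                result[tr][tc] = image[sr][sc]
    return result


# A 5x5 grey background (value 7) with one unwanted "object" pixel (value 0).
photo = [[7] * 5 for _ in range(5)]
photo[2][3] = 0
retouched = clone_stamp(photo, source=(2, 1), target=(2, 3), radius=1)
```

    Here the unwanted pixel at row 2, column 3 is overwritten with background copied from two columns to its left, which is the "copying a nearby background" step in miniature.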

  • Color image pipeline

    An image pipeline or video pipeline is the set of components commonly used between an image source (such as a camera, a scanner, or the rendering engine in a computer game) and an image renderer (such as a television set, a computer screen, a computer printer or a cinema screen), or for performing any intermediate digital image processing consisting of two or more separate processing blocks. An image/video pipeline may be implemented as computer software, in a digital signal processor, on an FPGA, or as a fixed-function ASIC. In addition, analog circuits can be used to perform many of the same functions.

    Typical components include image sensor corrections (including debayering or applying a Bayer filter), noise reduction, image scaling, gamma correction, image enhancement, colorspace conversion (between formats such as RGB, YUV or YCbCr), chroma subsampling, framerate conversion, image compression/video compression (such as JPEG), and computer data storage/data transmission.

    Typical goals of an imaging pipeline may be perceptually pleasing end results, colorimetric precision, a high degree of flexibility, low cost/low CPU utilization/long battery life, or reduction in bandwidth/file size.

    Some functions may be algorithmically linear. Mathematically, linear elements can be connected in any order without changing the end result. In practice this does not hold exactly, because digital computers work with finite-precision approximations and each stage introduces its own rounding. Other elements may be non-linear or time-variant. In both cases, there are often one or a few sequences of components that make sense for optimum precision and minimum hardware cost or CPU load.
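
    The point about ordering and finite precision can be seen with two gain stages that cancel exactly on paper. The Python sketch below assumes, purely for illustration, that pixels are stored as 8-bit integers between stages; `quantize` and `apply_gains` are hypothetical helper names, not part of any real pipeline.

```python
from typing import Sequence


def quantize(value: float) -> int:
    """Round half up and clamp to the 8-bit range, as a fixed-point stage might."""
    return max(0, min(255, int(value + 0.5)))


def apply_gains(pixel: int, gains: Sequence[float]) -> int:
    """Run a pixel through successive gain stages, storing 8 bits between stages."""
    for gain in gains:
        pixel = quantize(pixel * gain)
    return pixel


# Both orders multiply by 0.5 * 2.0 = 1.0 in exact arithmetic.
halve_then_double = apply_gains(101, [0.5, 2.0])  # 101 -> 51 -> 102
double_then_halve = apply_gains(101, [2.0, 0.5])  # 101 -> 202 -> 101
```

    Halving first loses the low bit to rounding, so the two orders disagree by one code value. With a brighter input such as 201, doubling first clips at 255 instead, which is why the preferred stage order depends on the data as well as on the arithmetic.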

  • SWILE

    SWILE (formerly Lunchr) is a French app-based company that focuses on improving the employee experience. Among other things, the platform offers meal vouchers, gift vouchers, mobility vouchers, and business travel solutions. In March 2020, it was renamed SWILE and entered the lunch break and meal voucher market.

    == History ==

    The company was founded as Lunchr by Loïc Soubeyrand in 2016. Originally, Lunchr was an app for pre-ordering lunch to eat in or to go. In January 2017, the company raised €2.5 million in seed funding from Daphni. In 2018, the company raised €11 million (Series A) from Idinvest, followed by another €30 million in February 2019 (Series B), notably from Index Ventures and Kima Ventures.

    In January 2020, Lunchr became one of the first startups to join the French Tech 120. A few months later, in March, Lunchr diversified its services, adding team life management tools and changing its brand name to Swile. In June 2020, the company raised a further €70 million in a new round of financing (Series C) from the same investors and the BPI. In November 2020, Swile acquired Briq, a startup specializing in employee engagement.

    In January 2021, Swile won a tender with Carrefour and distributed 62,000 Swile cards to its employees. In early October 2021, a new $200 million (€175 million) fundraising round, in which Japan's SoftBank joined other investors, brought Swile's valuation to $1 billion. President Emmanuel Macron cited the company as "a further proof that FrenchTech is at the forefront internationally." In May 2022, the company acquired the travel management start-up Okarito for €6 million.

    == Overview ==

    Swile operates in two countries (France and Brazil) and has a total of 1,000 employees, 5.5 million users and 85,000 corporate customers, including Carrefour, Le Monde, JCDecaux, PSG, Airbnb, Spotify, Red Bull, and TikTok in the private sector, as well as numerous local authorities and ministerial references in the public sector.

  • Zero-shot learning

    Zero-shot learning (ZSL) is a problem setup in deep learning where, at test time, a learner observes samples from classes which were not observed during training, and needs to predict the class that they belong to. The name is a play on words based on the earlier concept of one-shot learning, in which classification can be learned from only one, or a few, examples.

    Zero-shot methods generally work by associating observed and non-observed classes through some form of auxiliary information, which encodes observable distinguishing properties of objects. For example, given a set of images of animals to be classified, along with auxiliary textual descriptions of what animals look like, an artificial intelligence model which has been trained to recognize horses, but has never been given a zebra, can still recognize a zebra when it also knows that zebras look like striped horses. This problem is widely studied in computer vision, natural language processing, and machine perception.

    == Background and history ==

    The first paper on zero-shot learning in natural language processing, by Chang, Ratinov, Roth, and Srikumar, appeared at AAAI'08 in 2008, but the name given to the learning paradigm there was dataless classification. The first paper on zero-shot learning in computer vision appeared at the same conference, under the name zero-data learning. The term zero-shot learning itself first appeared in the literature in a 2009 paper from Palatucci, Hinton, Pomerleau, and Mitchell at NIPS'09. This terminology was repeated later in another computer vision paper and the term zero-shot learning caught on, as a take-off on one-shot learning that was introduced in computer vision years earlier.

    In computer vision, zero-shot learning models learn parameters for seen classes along with their class representations and rely on representational similarity among class labels so that, during inference, instances can be classified into new classes.
    In natural language processing, the key technical direction developed builds on the ability to "understand the labels", that is, to represent the labels in the same semantic space as that of the documents to be classified. This supports the classification of a single example without observing any annotated data, the purest form of zero-shot classification. The original paper made use of the Explicit Semantic Analysis (ESA) representation but later papers made use of other representations, including dense representations. This approach was also extended to multilingual domains, fine entity typing and other problems. Moreover, beyond relying solely on representations, the computational approach has been extended to depend on transfer from other tasks, such as textual entailment and question answering.

    The original paper also points out that, beyond the ability to classify a single example, when a collection of examples is given, with the assumption that they come from the same distribution, it is possible to bootstrap the performance in a semi-supervised-like manner (or transductive learning).

    Unlike standard generalization in machine learning, where classifiers are expected to correctly classify new samples to classes they have already observed during training, in ZSL no samples from the classes have been given while training the classifier. It can therefore be viewed as an extreme case of domain adaptation.

    == Prerequisite information for zero-shot classes ==

    Naturally, some form of auxiliary information has to be given about these zero-shot classes, and this information can be of several types:

      • Learning with attributes: classes are accompanied by a pre-defined structured description. For example, for bird descriptions, this could include "red head" or "long beak". These attributes are often organized in a structured compositional way, and taking that structure into account improves learning. While this approach was used mostly in computer vision, there are some examples of it in natural language processing as well.
      • Learning from textual description: as pointed out above, this has been the key direction pursued in natural language processing. Here class labels are taken to have a meaning and are often augmented with definitions or a free-text natural-language description. This could include, for example, a Wikipedia description of the class.
      • Class-class similarity: here, classes are embedded in a continuous space. A zero-shot classifier can predict that a sample corresponds to some position in that space, and the nearest embedded class is used as the predicted class, even if no such samples were observed during training.

    == Generalized zero-shot learning ==

    The above ZSL setup assumes that at test time, only zero-shot samples are given, namely, samples from new unseen classes. In generalized zero-shot learning, samples from both new and known classes may appear at test time. This poses new challenges for classifiers at test time, because it is very challenging to estimate whether a given sample is new or known. Some approaches to handle this include:

      • a gating module, which is first trained to decide if a given sample comes from a new class or from an old one, and then, at inference time, outputs either a hard decision or a soft probabilistic decision
      • a generative module, which is trained to generate feature representations of the unseen classes; a standard classifier can then be trained on samples from all classes, seen and unseen.

    == Domains of application ==

    Zero-shot learning has been applied to the following fields:

      • image classification
      • semantic segmentation
      • image generation
      • object detection
      • natural language processing
      • computational biology
      • abstract reasoning
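
    The attribute-based and class-similarity ideas described under "Prerequisite information for zero-shot classes" can be combined in a toy Python sketch: each class is a point in attribute space, and a sample is assigned to the nearest class even if that class was never seen in training. The attribute names, the vectors and the `sample` values are all made up for illustration, and the attribute predictor itself is only assumed, not implemented.

```python
import math

# Hypothetical auxiliary information: one attribute vector per class,
# ordered as (has_hooves, has_stripes, eats_meat).
CLASS_ATTRIBUTES = {
    "horse": (1.0, 0.0, 0.0),  # seen during training
    "tiger": (0.0, 1.0, 1.0),  # seen during training
    "zebra": (1.0, 1.0, 0.0),  # never seen: described only by its attributes
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def classify(predicted_attributes, candidates=CLASS_ATTRIBUTES):
    """Return the class whose attribute vector is most similar to the prediction."""
    return max(
        candidates,
        key=lambda name: cosine(predicted_attributes, candidates[name]),
    )


# Stand-in for the output of an attribute predictor trained on horses and tigers only.
sample = (0.9, 0.8, 0.1)
label = classify(sample)
```

    Because the candidate set here contains both seen and unseen classes, this is closer to the generalized setting; restricting `candidates` to unseen classes would give the plain ZSL setup.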

  • Trello

    Trello is a web-based, kanban-style list-making application developed by Atlassian. Created in 2011 by Fog Creek Software, it was spun out to form the basis of a separate company in New York City in 2014 and sold to Atlassian in January 2017.

    == History ==

    The name Trello is derived from the word trellis, which had been a code name for the project at its early stages. Trello was released at a TechCrunch event by Fog Creek founder Joel Spolsky. In September 2011, Wired magazine named the application one of "The 7 Coolest Startups You Haven't Heard of Yet". Lifehacker said "it makes project collaboration simple and kind of enjoyable".

    In 2014, it raised US$10.3 million in funding from Index Ventures and Spark Capital. Prior to its acquisition, Trello had sold 22% of its shares to investors, with the remaining shares held by founders Michael Pryor and Joel Spolsky. In May 2015, Trello expanded internationally with localized interfaces for Brazil, Germany, and Spain. In May 2016, Trello claimed it had more than 1.1 million daily active users and 14 million total signups.

    In 2016, Trello launched the Power-Up platform, allowing third-party developers to build and distribute extensions to Trello known as Power-Ups. Initial integrations included Zendesk, SurveyMonkey and Giphy. By January 2022 there were a total of 247 Power-Ups listed in the Power-Up directory.

    On 9 January 2017, Atlassian announced its intent to acquire Trello for $425 million. The transaction was made with $360 million in cash and $65 million in shares and options. In December 2018, Trello announced its acquisition of Butler, a company that developed a leading Power-Up for automating tasks within a Trello board. Trello announced 35 million users in March 2019 and 50 million users in October 2019.
    In 2020, Craig Jones, then cybersecurity operations director at Sophos, found that personally identifiable information (PII) of Trello users was exposed through public Trello boards; he had first tweeted about the issue in 2018.

    On 16 January 2024, data from a Trello breach, containing over 15 million unique email addresses, names and usernames, was posted on a popular hacking forum. The data was obtained by enumerating a publicly accessible resource using email addresses from previous breach corpora; it was added to the breach-notification website "Have I Been Pwned?" on 22 January 2024.

    == Uses ==

    Users can create task boards with different columns and move the tasks between them. Typically, columns represent task statuses such as To Do, In Progress, and Done. The tool can be used for personal and business purposes including real estate management, software project management, school bulletin boards, lesson planning, accounting, web design, gaming, and law office case management.

    == Architecture ==

    According to a Fog Creek blog post in January 2012, the client was a thin web layer which downloads the main app, written in CoffeeScript and compiled to minified JavaScript, using Backbone.js, HTML5 .pushState(), and the Mustache templating language. The server was built on top of MongoDB, Node.js and a modified version of Socket.io.

    == Reception ==

    On 26 January 2017, PC Magazine gave Trello a 3.5 / 5 rating, calling it "flexible" and saying that "you can get rather creative", while noting that "it may require some experimentation to figure out how to best use it for your team and the workload you manage."

  • Stride (software)

    Stride was a cloud-based team business communication and collaboration tool, launched by Atlassian on 7 September 2017 to replace the cloud-based version of HipChat. Stride software was available to download onto computers running Windows, Mac or Linux, as well as Android and iOS smartphones and tablets. Stride was bought by Atlassian's competitor Slack Technologies and was discontinued on February 15, 2019.

    The features of Stride included chat rooms, one-on-one messaging, file sharing, 5 GB of file storage, group voice and video calling, built-in collaboration tools, and up to 25,000 messages of searchable history. Premium features included unlimited file storage, users, group chat rooms, file sharing and storage, apps, and history retention. The premium version, priced at $3/user/month, also included advanced meeting functionality such as group screen sharing, remote desktop control, and dial-in/dial-out capabilities.

    Stride offered integrations with Atlassian's other products as well as with third-party applications listed in the Atlassian Marketplace, such as GitHub, Giphy, Stand-Bot and Google Calendar.

    Stride offered additional features beyond messaging to improve efficiency and productivity. It aimed to reduce collaboration noise by introducing a "focus" mode, and to eliminate the divisions between text chat, voice meetings, and videoconferencing by simplifying the transition between these modes in the same channel.

    On July 26, 2018, Atlassian announced that HipChat and Stride would be discontinued on February 15, 2019, and that it had reached a deal to sell their intellectual property to Slack. Slack paid an undisclosed amount over three years to assume the user bases of the services, while Atlassian took a minority investment in Slack. The companies also announced a commitment to work on integration of Slack with Atlassian services.

  • Color vision

    Color vision (CV), a feature of visual perception, is an ability to perceive differences between light composed of different frequencies independently of light intensity. Color perception is a part of the larger visual system and is mediated by a complex process between neurons that begins with differential stimulation of different types of photoreceptors by light entering the eye. Those photoreceptors then emit outputs that are propagated through many layers of neurons, ultimately leading to higher cognitive functions in the brain. Color vision is found in many animals and is mediated by similar underlying mechanisms with common types of biological molecules and a complex history of the evolution of color vision within different animal taxa. In primates, color vision may have evolved under selective pressure for a variety of visual tasks including the foraging for nutritious young leaves, ripe fruit, and flowers, as well as detecting predator camouflage and emotional states in other primates.

    == Wavelength ==

    Isaac Newton discovered that white light, after being split into its component colors when passed through a dispersive prism, could be recombined to make white light by passing those colors through a different prism.

    The visible light spectrum ranges from about 380 to 740 nanometers. Spectral colors (colors that are produced by a narrow band of wavelengths) such as red, orange, yellow, green, cyan, blue, and violet can be found in this range. These spectral colors do not refer to a single wavelength, but rather to a set of wavelengths: red, 625–740 nm; orange, 590–625 nm; yellow, 565–590 nm; green, 500–565 nm; cyan, 485–500 nm; blue, 450–485 nm; violet, 380–450 nm.

    Wavelengths longer or shorter than this range are called infrared or ultraviolet, respectively. Humans cannot generally see these wavelengths, but other animals may.
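
    As a small worked example, the wavelength ranges listed under "Wavelength" can be turned into a lookup in Python. The band edges come straight from that list; the function name `spectral_color` and the decision to assign a shared edge (such as 450 nm) to the longer-wavelength band are arbitrary choices made for this sketch.

```python
# Band edges in nanometres, copied from the ranges listed in the text.
SPECTRAL_BANDS = [
    (380, 450, "violet"),
    (450, 485, "blue"),
    (485, 500, "cyan"),
    (500, 565, "green"),
    (565, 590, "yellow"),
    (590, 625, "orange"),
    (625, 740, "red"),
]


def spectral_color(wavelength_nm: float) -> str:
    """Name the band containing a wavelength, or the region outside the visible range."""
    if wavelength_nm < 380:
        return "ultraviolet"
    if wavelength_nm > 740:
        return "infrared"
    for low, high, name in SPECTRAL_BANDS:
        # Shared edges go to the longer-wavelength band (an arbitrary choice).
        if low <= wavelength_nm < high:
            return name
    return "red"  # exactly 740 nm, the upper edge of the visible range
```

    For instance, `spectral_color(555)` falls in the green band, the region where the text notes that cones are most sensitive.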
    === Hue detection ===

    Sufficient differences in wavelength cause a difference in the perceived hue; the just-noticeable difference in wavelength varies from about 1 nm in the blue-green and yellow wavelengths to 10 nm and more in the longer red and shorter blue wavelengths. Although the human eye can distinguish up to a few hundred hues, when those pure spectral colors are mixed together or diluted with white light, the number of distinguishable chromaticities can be much higher.

    In very low light levels, vision is scotopic: light is detected by rod cells of the retina. Rods are maximally sensitive to wavelengths near 500 nm and play little, if any, role in color vision. In brighter light, such as daylight, vision is photopic: light is detected by cone cells, which are responsible for color vision. Cones are sensitive to a range of wavelengths, but are most sensitive to wavelengths near 555 nm. Between these regions, mesopic vision comes into play and both rods and cones provide signals to the retinal ganglion cells. The shift in color perception from dim light to daylight gives rise to differences known as the Purkinje effect.

    The perception of "white" is formed by the entire spectrum of visible light, or by mixing colors of just a few wavelengths in animals with few types of color receptors. In humans, white light can be perceived by combining wavelengths such as red, green, and blue, or just a pair of complementary colors such as blue and yellow.

    === Non-spectral colors ===

    There are a variety of colors in addition to spectral colors and their hues. These include grayscale colors, shades of colors obtained by mixing grayscale colors with spectral colors, violet-red colors, impossible colors, and metallic colors.

    Grayscale colors include white, gray, and black. Rods contain rhodopsin, which reacts to light intensity, providing grayscale coloring.

    Shades include colors such as pink or brown. Pink is obtained from mixing red and white.
    Brown may be obtained from mixing orange with gray or black. Navy is obtained from mixing blue and black.

    Violet-red colors include hues and shades of magenta. The light spectrum is a line on which violet is one end and the other is red, and yet we see hues of purple that connect those two colors.

    Impossible colors are a combination of cone responses that cannot be naturally produced. For example, medium cones cannot be activated completely on their own; if they were, we would see a 'hyper-green' color.

    == Dimensionality ==

    Color vision is categorized foremost according to the dimensionality of the color gamut, which is defined by the number of primaries required to represent the color vision. This is generally equal to the number of photopsins expressed: a correlation that holds for vertebrates but not invertebrates. The common vertebrate ancestor possessed four photopsins (expressed in cones) plus rhodopsin (expressed in rods), so was tetrachromatic. However, many vertebrate lineages have lost one or many photopsin genes, leading to lower-dimension color vision. The dimensionality of color vision ranges from one dimension upward.

    == Physiology of color perception ==

    Perception of color begins with specialized retinal cells known as cone cells. Cone cells contain different forms of opsin (a pigment protein) that have different spectral sensitivities. Humans have three types, resulting in trichromatic color vision. Each individual cone contains pigments composed of opsin apoprotein covalently linked to a light-absorbing prosthetic group: either 11-cis-hydroretinal or, more rarely, 11-cis-dehydroretinal.

    The cones are conventionally labeled according to the ordering of the wavelengths of the peaks of their spectral sensitivities: short (S), medium (M), and long (L) cone types. These three types do not correspond well to particular colors as we know them.
    Rather, the perception of color is achieved by a complex process that starts with the differential output of these cells in the retina and which is finalized in the visual cortex and associative areas of the brain.

    For example, while the L cones have been referred to simply as red receptors, microspectrophotometry has shown that their peak sensitivity is in the greenish-yellow region of the spectrum. Similarly, the S cones and M cones do not directly correspond to blue and green, although they are often described as such. The RGB color model, therefore, is a convenient means for representing color but is not directly based on the types of cones in the human eye.

    The peak response of human cone cells varies, even among individuals with typical color vision; in some non-human species this polymorphic variation is even greater, and it may well be adaptive.

    === Theories ===

    Two complementary theories of color vision are the trichromatic theory and the opponent process theory. The trichromatic theory, or Young–Helmholtz theory, proposed in the 19th century by Thomas Young and Hermann von Helmholtz, posits three types of cones preferentially sensitive to blue, green, and red, respectively. Others have suggested that the trichromatic theory is not specifically a theory of color vision but a theory of receptors for all vision, including color but not specific or limited to it. Equally, it has been suggested that the relationship between the phenomenal opponency described by Ewald Hering and the physiological opponent processes is not straightforward (see below), making physiological opponency a mechanism that is relevant to the whole of vision, and not just to color vision.

    Hering proposed the opponent process theory in 1872. It states that the visual system interprets color in an antagonistic way: red vs. green, blue vs. yellow, black vs. white. Both theories are generally accepted as valid, describing different stages in visual physiology, visualized in the adjacent diagram.

    Green–magenta and blue–yellow are scales with mutually exclusive boundaries. In the same way that there cannot exist a "slightly negative" positive number, a single eye cannot perceive a bluish-yellow or a reddish-green. Although both theories are currently widely accepted, past and more recent work has led to criticism of the opponent process theory, stemming from a number of what are presented as discrepancies in the standard opponent process theory. For example, the phenomenon of an after-image of complementary color can be induced by fatiguing the cells responsible for color perception, by staring at a vibrant color for a length of time, and then looking at a white surface. This phenomenon of complementary colors shows that cyan, rather than green, is the complement of red, and that magenta, rather than red, is the complement of green. It therefore also shows that the reddish-green color supposed to be impossible by opponent process theory is actually the color yellow. Although this phenomenon is more readily explained by the trichromatic theory, explanations for the discrepancy may include alterations to the opponent process theory, such as redefining the opponent colors as red vs. cyan, to reflect this effect. Despite such criticis

  • Artificial intelligence in spirituality

    Some users of artificial intelligence (AI) technologies, especially chatbots, may develop beliefs that AI has or can attain supernatural or spiritual powers. AI models such as ChatGPT are turned to for fortune telling, mysticism and remote viewing. Recent and sudden advances in large language models have led to folk myths about their origin or capabilities, as well as their deification or worship by some users. Tucker Carlson has made similar claims, including directly to Sam Altman. Pope Leo XIV advised priests against using large language models to write sermons.

  • Abiquo Enterprise Edition

    Abiquo Hybrid Cloud Management Platform is a web-based cloud computing software platform developed by Abiquo. Written entirely in Java, it is used to build, integrate and manage public and private clouds in homogeneous environments. Users can deploy and manage servers, storage systems, networks and virtual devices. It also supports LDAP integration.

    == Hypervisors ==

    Abiquo supports five hypervisor systems:

      • VMware ESXi
      • Microsoft Hyper-V
      • Citrix XenServer
      • Oracle VM Server for x86
      • KVM

    From version 3.1, it also supports multiple public cloud providers:

      • Amazon AWS
      • Rackspace
      • Google Compute Engine
      • HP Cloud
      • ElasticHosts
      • DigitalOcean

    Abiquo version 3.2 added:

      • Microsoft Azure

    Abiquo version 3.4 added:

      • Support for Docker hosts, adding multi-tenant networking, storage management and private registry management for Docker
      • SoftLayer
      • CloudSigma

    Later versions continued to add features including autoscaling on any cloud, integration with VMware NSX and OpenStack Neutron for software-defined networking, guest configuration with cloud-init, and integrated monitoring driving guest automation.

    == Storage services ==

    Abiquo supports any vendor for hypervisor storage, and also supports tiered storage pools, enabling storage-as-a-service from specific vendors and technologies including:

      • NFS
      • Generic iSCSI
      • NetApp
      • Nexenta

    == SAAS version ==

    In April 2014 Abiquo launched Abiquo anyCloud, a SAAS version of the Abiquo Hybrid Cloud Platform software. This version lets users manage public cloud resources from:

      • Amazon AWS
      • Microsoft Azure
      • IBM SoftLayer
      • DigitalOcean
      • Rackspace Open Cloud (an OpenStack cloud)
      • HP Public Cloud (an OpenStack cloud)
      • Google Compute Engine
      • ElasticHosts

    Additional security and process features include workflow (so that an enterprise administrator electronically signs off on changes), an audit trail of activity, and the ability to share cloud accounts among an enterprise team in a secure way.

    == Reviews and awards ==

      • Finalist for the 2015 Cloud Awards
      • Finalist for the 2015 UK Cloud Awards in the category Cloud Management Product of the Year
      • EMA Radar for Private Cloud platforms 2013
      • Global Telecoms Business Innovation Summit and Awards 2013 (with Interoute)
      • EuroCloud UK Awards

    Read more →
  • Netvibes

    Netvibes

    Netvibes is a French brand of Dassault Systèmes that previously ran a web service offering a dashboard and feed reader. Currently, the company offers business intelligence tools. == History == === 2005–2012 === Founded in 2005 by Tariq Krim, the company provided software for personalized dashboards for real-time monitoring, social analytics, knowledge sharing, and decision support. === 2012–present === On February 9, 2012, Dassault Systèmes announced the acquisition of Netvibes. As of 2024, Netvibes also contains the operations of two other software companies acquired by Dassault Systèmes. Exalead: founded in 2000 by François Bourdoncle, the company provided search platforms and search-based applications for consumer and business users. On June 9, 2010, Dassault Systèmes acquired the company. Proxem: founded in 2007 by François-Régis Caumartin, the company provided AI-powered semantic processing software and services. On June 23, 2020, Dassault Systèmes acquired Proxem and integrated its technology into the 3DEXPERIENCE® platform to complement its information intelligence applications. Dassault Systèmes announced in April 2025 that Netvibes would retire its standalone web service offering on June 2, 2025. == Activities == Brand monitoring – to track clients, customers and competitors across media sources all in one place, analyze live results with third-party reporting tools, and provide media monitoring dashboards for brand clients. E-reputation management – to visualize real-time online conversations and social activity feeds, and track new trending topics. Product marketing – to create interactive product microsites with a drag-and-drop publishing interface. Community portals – to engage online communities. Personalized workspaces – to gather all essential company updates to support specific divisions (e.g. sales, marketing, human resources) and localizations. The software was a multi-lingual Ajax-based start page or web portal. It was organized into tabs, with each tab containing user-defined modules. Built-in Netvibes modules included an RSS/Atom feed reader, local weather forecasts, a calendar supporting iCal, bookmarks, notes, to-do lists, multiple searches, support for POP3 and IMAP4 email as well as several webmail providers including Gmail, Yahoo! Mail, Hotmail, and AOL Mail, Box.net web storage, Delicious, Meebo, Flickr photos, podcast support with a built-in audio player, and several others. A page could be personalized further through the use of existing themes or by creating a personal theme. Customized tabs, feeds and modules could be shared with others individually or via the Netvibes Ecosystem. For privacy reasons, only modules with publicly available content could be shared.

    Read more →
  • Apache Drill

    Apache Drill

    Apache Drill is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Built chiefly by contributions from developers from MapR, Drill is inspired by Google's Dremel system. Drill is an Apache top-level project. Drill supports a variety of NoSQL databases and file systems, including Alluxio, HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A single query can join data from multiple datastores. Drill's datastore-aware optimizer automatically restructures a query plan to leverage the datastore's internal processing capabilities. In addition, Drill supports data locality when Drill and the datastore run on the same nodes. Tomer Shiran is the founder of the Apache Drill project. It was designated an Apache Software Foundation top-level project in December 2014. == Features == One explicitly stated design goal is for Drill to scale to 10,000 servers or more and to process petabytes of data and trillions of records in seconds. Other features include: a schema-free JSON document model similar to MongoDB and Elasticsearch, with no formal schema declaration required; industry-standard APIs (ANSI SQL, ODBC/JDBC, RESTful APIs); a user- and developer-friendly design; a pluggable architecture that enables connectivity to multiple datastores; dynamic user-defined functions (added in version 1.9); and cryptographic functions and PCAP file format support (added in version 1.11). == Back-end support == Drill is primarily focused on non-relational datastores, including Apache Hadoop text files, NoSQL, and cloud storage. A notable feature is in situ querying of local JSON and Apache Parquet files. 
Some additional datastores that it supports include: all Hadoop distributions (HDFS API 2.3+), including Apache Hadoop, MapR, CDH and Amazon EMR; NoSQL stores (MongoDB, Apache HBase, Apache Cassandra); online analytical processing systems (Apache Kudu, Apache Druid, OpenTSDB); cloud storage (Amazon S3, Google Cloud Storage, Azure Blob Storage, Swift, IBM Cloud Object Storage); diverse data formats, including Apache Avro, Apache Parquet and JSON; and RDBMS storage plugins (using JDBC to connect to MySQL, PostgreSQL, and others). A new datastore can be added by developing a storage plugin. Drill's "schema-free" JSON data model enables it to query non-relational datastores in situ. == Front-end support == Drill itself can be queried via JDBC, ODBC, or REST through a variety of methods and languages including Python and Java. The default install includes a web interface allowing end users to execute ANSI SQL directly and export data tables as CSV files without any programming. The dashboard library Apache Superset is particularly well suited for visualization of data queried with Drill.
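
    As a rough illustration of the REST route mentioned above, the sketch below prepares and sends a SQL query to Drill's `/query.json` REST endpoint using only the Python standard library. The base URL (a local Drillbit on port 8047, Drill's default web port) is an assumption about the deployment, and `build_drill_request` and `run_query` are hypothetical helper names, not part of Drill itself.

```python
import json
import urllib.request


def build_drill_request(sql, base_url="http://localhost:8047"):
    """Build (but do not send) a POST request for Drill's /query.json endpoint."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(sql, base_url="http://localhost:8047", timeout=30):
    """Send the query to a running Drillbit and return the result rows.

    Assumes the server is reachable and does not require authentication.
    """
    request = build_drill_request(sql, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Drill's JSON response carries the result set under the "rows" key.
    return body.get("rows", [])
```

    With a Drillbit running locally, a call such as run_query("SELECT * FROM cp.`employee.json` LIMIT 3") would query the sample data set bundled on Drill's classpath. Keeping request construction separate from sending makes the payload easy to inspect without a server.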

    Read more →
  • Self-supervised learning

    Self-supervised learning

    Self-supervised learning (SSL) is a paradigm in machine learning where a model is trained on a task using the data itself to generate supervisory signals, rather than relying on externally-provided labels. In the context of neural networks, self-supervised learning aims to leverage inherent structures or relationships within the input data to create meaningful training signals. SSL tasks are designed so that solving them requires capturing essential features or relationships in the data. The input data is typically augmented or transformed in a way that creates pairs of related samples, where one sample serves as the input, and the other is used to formulate the supervisory signal. This augmentation can involve introducing noise, cropping, rotation, or other transformations. Self-supervised learning more closely imitates the way humans learn to classify objects. During SSL, the model learns in two steps. First, the task is solved based on an auxiliary or pretext classification task using pseudo-labels, which help to initialize the model parameters. Next, the actual task is performed with supervised or unsupervised learning. Self-supervised learning has produced promising results in recent years, and has found practical application in fields such as audio processing, and is being used by Facebook and others for speech recognition. == Pseudo-labels == Pseudo-labels are automatically generated labels that a model assigns to unlabeled data based on its own predictions. They are widely used in self-supervised and semi-supervised learning, where ground-truth annotations are limited or unavailable. By treating predicted labels as surrogate ground truth, learning algorithms can make use of large quantities of unlabeled data in the training process. Pseudo-labeling also plays an important role in systems that must adapt to concept drift, where the statistical properties of the data change over time. 
In these scenarios, the model may detect that an incoming instance deviates from previously learned behavior. The system then generates a classification result for that instance, and this predicted class is used as a pseudo-label for updating or retraining model components that are becoming outdated. This approach enables continuous adaptation in dynamic environments without requiring manual annotation. In many adaptive learning pipelines, pseudo-labels are chosen when the classifier produces sufficiently confident predictions, reducing the risk of propagating errors. These pseudo-labeled instances are then incorporated into training to refresh or evolve the model's understanding of emerging data patterns, particularly when existing components show signs of “aging” due to drift or distributional shifts. This strategy reduces reliance on manual labeling while helping maintain long-term model performance. == Types == === Autoassociative self-supervised learning === Autoassociative self-supervised learning is a specific category of self-supervised learning where a neural network is trained to reproduce or reconstruct its own input data. In other words, the model is tasked with learning a representation of the data that captures its essential features or structure, allowing it to regenerate the original input. The term "autoassociative" comes from the fact that the model is essentially associating the input data with itself. This is often achieved using autoencoders, which are a type of neural network architecture used for representation learning. Autoencoders consist of an encoder network that maps the input data to a lower-dimensional representation (latent space), and a decoder network that reconstructs the input from this representation. The training process involves presenting the model with input data and requiring it to reconstruct the same data as closely as possible. 
The loss function used during training typically penalizes the difference between the original input and the reconstructed output (e.g. mean squared error). By minimizing this reconstruction error, the autoencoder learns a meaningful representation of the data in its latent space. === Contrastive self-supervised learning === For a binary classification task, training data can be divided into positive examples and negative examples. Positive examples are those that match the target. For example, if training a classifier to identify birds, the positive training data would include images that contain birds. Negative examples would be images that do not. Contrastive self-supervised learning uses both positive and negative examples. The loss function in contrastive learning is used to minimize the distance between positive sample pairs, while maximizing the distance between negative sample pairs. An early example uses a pair of 1-dimensional convolutional neural networks to process a pair of images and maximize their agreement. Contrastive Language-Image Pre-training (CLIP) allows joint pretraining of a text encoder and an image encoder, such that a matching image-text pair have image encoding vector and text encoding vector that span a small angle (having a large cosine similarity). InfoNCE (Noise-Contrastive Estimation) is a method to optimize two models jointly, based on Noise Contrastive Estimation (NCE). 
Given a set $X=\{x_{1},\ldots,x_{N}\}$ of $N$ random samples containing one positive sample from $p(x_{t+k}\mid c_{t})$ and $N-1$ negative samples from the 'proposal' distribution $p(x_{t+k})$, it minimizes the following loss function: $\mathcal{L}_{\mathrm{N}}=-\mathbb{E}_{X}\left[\log\frac{f_{k}(x_{t+k},c_{t})}{\sum_{x_{j}\in X}f_{k}(x_{j},c_{t})}\right]$ === Non-contrastive self-supervised learning === Non-contrastive self-supervised learning (NCSSL) uses only positive examples. Counterintuitively, NCSSL converges on a useful local minimum rather than reaching a trivial solution, with zero loss. For the example of binary classification, it would trivially learn to classify each example as positive. Effective NCSSL requires an extra predictor on the online side that does not back-propagate on the target side. === Joint-Embedding and Predictive Architectures === A major class of self-supervised learning moves beyond contrastive pairs, instead maximizing the agreement between views while preventing collapse through statistical constraints. Rooted in Deep Canonical Correlation Analysis (Deep CCA), this approach includes Joint-Embedding Architectures (JEA) like Barlow Twins and VICReg, which enforce covariance constraints to learn invariant representations without negative sampling. Deep Latent Variable Path Modelling (DLVPM) generalizes this to multimodal systems, using path models to enforce correlation and orthogonality across diverse data types. In 2022 Yann LeCun introduced Joint-Embedding Predictive Architectures (JEPA) as a step towards decision making, reasoning, and autonomous human intelligence in machines, including self-improvement through autonomous learning. 
Founded in representation learning, LeCun included the concept of a “world model” in JEPA which aims to enable machines to replicate human intellect by providing machines with a concept for the world in which they exist. Unlike autoencoders, JEPAs operate entirely in latent space, avoiding pixel-level noise to focus on semantic structure. Rather than just learning invariance, JEPAs learn by predicting masked latent representations from visible context. JEPA has been applied to domains such as image analysis, audio processing, and motion in images and video. == Comparison with other forms of machine learning == SSL belongs to supervised learning methods insofar as the goal is to generate a classified output from the input. At the same time, however, it does not require the explicit use of labeled input-output pairs. Instead, correlations, metadata embedded in the data, or domain knowledge present in the input are implicitly and autonomously extracted from the data. These supervisory signals, extracted from the data, can then be used for training. SSL is similar to unsupervised learning in that it does not require labels in the sample data. Unlike unsupervised learning, however, learning is not done using inherent data structures. Semi-supervised learning combines supervised and unsupervised learning, requiring only a small portion of the learning data be labeled. In transfer learning, a model designed for one task is reused on a different task. Training an autoencoder intrinsically constitutes a self-supervised process, because the output pattern needs to become an optimal reconstruction of the input pattern itself. However, in current jargon, the term 'self-supervised' often refers to tasks based on a pretext-task training setup.
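
    The InfoNCE loss from the contrastive learning section can be illustrated numerically for a single sample set X. This is a minimal sketch in plain Python, assuming the score function takes the common form f_k(x, c) = exp(x · c / temperature); the dot-product score, the `temperature` parameter and the function names are illustrative assumptions rather than part of the definition in the text. The expectation over X would be approximated by averaging this value over many sample sets.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(context, positive, negatives, temperature=1.0):
    """InfoNCE for one sample set: -log( f(pos, c) / sum_j f(x_j, c) ).

    Assumes f(x, c) = exp(dot(x, c) / temperature), so the loss equals
    logsumexp(all log-scores) minus the positive sample's log-score.
    """
    # Log-scores for the positive (index 0) followed by the N-1 negatives.
    logits = [dot(x, context) / temperature for x in [positive] + negatives]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_denominator - logits[0]
```

    When the positive sample scores much higher than the negatives, the ratio inside the logarithm approaches 1 and the loss approaches 0; when all N samples score equally, the loss is log N.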

    Read more →
  • Cloud-Based Secure File Transfer

    Cloud-Based Secure File Transfer

    Cloud-Based Secure File Transfer is a managed or hosted file transfer service that provides cloud storage that can be accessed via SSH File Transfer Protocol (SFTP). These services allow secure, reliable file transfers while offering the scalability, redundancy, and high availability of cloud infrastructure. == Technical overview == The evolution of file transfer protocols began with File Transfer Protocol (FTP) and SSH File Transfer Protocol (SFTP). SFTP offered enhanced security through the use of SSH (Secure Shell) encryption, which addressed many of the security concerns associated with traditional FTP. Over time, as businesses increasingly adopted cloud infrastructure, the demand for services that integrate secure file transfer with cloud storage led to the rise of Cloud-Based Secure File Transfer services. These services combine the benefits of secure, encrypted file transfer with the scalability and flexibility of cloud-based storage systems. Traditional on-premises SFTP typically involves setting up and managing physical or virtual servers to handle file transfers. In contrast, Cloud-Based Secure File Transfer utilizes managed cloud infrastructure, such as AWS EC2, Azure VMs, or Google Cloud, to automate scaling, ensure redundancy, and provide high availability. These cloud environments can be configured to automatically scale with demand, enabling businesses to handle large volumes of data transfers without the need for extensive physical hardware. == Features == Scalability and availability: Cloud-Based Secure File Transfer services are inherently scalable, with features like load balancing, multi-region deployments, and auto-scaling groups that adjust resources in response to traffic spikes. This ensures that the system can handle varying workloads and provides continuous availability, even during high-demand periods. 
Cost-effectiveness: By eliminating the need for physical infrastructure and reducing ongoing server maintenance costs, Cloud-Based Secure File Transfer services offer significant cost savings compared to traditional on-premises services. Cloud providers typically offer pay-as-you-go pricing models, where users only pay for the resources they use, further optimizing costs. Security and compliance: Cloud-Based Secure File Transfer products offer strong security measures, including end-to-end encryption, key management, detailed logging, and auditing. These services are often compliant with industry regulations such as HIPAA (Health Insurance Portability and Accountability Act), GDPR (General Data Protection Regulation), and SOC 2 (System and Organization Controls), ensuring that data transfers meet necessary security and privacy standards. == Uses == Cloud-Based Secure File Transfer is used across various industries to securely transfer sensitive data and integrate into business workflows. In healthcare, Cloud-Based Secure File Transfer is essential for securely transferring electronic Protected Health Information (ePHI), ensuring compliance with regulations like HIPAA. In financial institutions, it is used to protect sensitive financial data during transfer, maintaining privacy and security. Data analytics also benefits from Cloud-Based Secure File Transfer, offering a secure and efficient method for transferring large datasets between systems or partners. Technically, Cloud-Based Secure File Transfer is often integrated into enterprise workflows through automated file transfers, using scripting or APIs. It also plays a key role in cloud backup and disaster recovery, ensuring that files are securely transferred and stored in cloud environments, which supports business continuity. However, businesses must address certain implementation challenges. 
Despite its secure design, Cloud-Based Secure File Transfer is not immune to risks such as misconfigured SSH keys, improper access control, or inadequate encryption. Regular security audits and careful configuration management are necessary to minimize the risk of data breaches. Additionally, integrating Cloud-Based Secure File Transfer with legacy systems can present challenges, such as incompatible APIs or outdated authentication methods. == Comparisons with related technologies == Cloud-Based Secure File Transfer differs from traditional SFTP primarily in its deployment and management model. Traditional SFTP services are typically hosted on-premises or on virtual servers, requiring manual configuration, ongoing infrastructure maintenance, and security management by in-house IT teams. In contrast, Cloud-Based Secure File Transfer is offered as a Software-as-a-Service (SaaS) service, reducing infrastructure overhead by eliminating the need for dedicated hardware or virtual machines. This model simplifies management through centralized web-based interfaces, automated updates, and built-in scalability. While Cloud-Based Secure File Transfer is focused on providing secure file transfers over the SFTP protocol, Managed File Transfer (MFT) platforms generally support a broader range of protocols, including FTP, FTPS, HTTP/S, and AS2. MFT services often include advanced features such as end-to-end encryption, extensive automation, compliance reporting, and integration with enterprise systems. Cloud-Based Secure File Transfer services may offer some of these features but are typically more lightweight and streamlined, targeting organizations seeking a secure and scalable alternative to traditional SFTP without the full suite of MFT capabilities. As such, Cloud-Based Secure File Transfer can be seen as a specialized subset within the broader managed file transfer ecosystem.
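
    One way the scripted, automated transfers mentioned under Uses might look is a small wrapper that drives the OpenSSH `sftp` client in batch mode with key-based authentication. The sketch below is an assumption-laden example, not any provider's API: the host, user, key path and directory names are placeholders, `build_sftp_upload` and `upload` are hypothetical helpers, and quoting paths with `shlex.quote` only approximates sftp's own quoting rules.

```python
import shlex
import subprocess


def build_sftp_upload(host, user, key_path, local_files, remote_dir, port=22):
    """Return (argv, batch_script) for a non-interactive OpenSSH sftp upload."""
    argv = [
        "sftp",
        "-b", "-",        # read batch commands from standard input
        "-i", key_path,   # private key used for authentication
        "-P", str(port),  # server port
        f"{user}@{host}",
    ]
    # Batch script: change to the target directory, upload each file, quit.
    lines = [f"cd {shlex.quote(remote_dir)}"]
    lines += [f"put {shlex.quote(path)}" for path in local_files]
    lines.append("bye")
    return argv, "\n".join(lines) + "\n"


def upload(host, user, key_path, local_files, remote_dir, port=22):
    """Run the transfer; raises CalledProcessError if sftp exits with an error."""
    argv, script = build_sftp_upload(
        host, user, key_path, local_files, remote_dir, port
    )
    subprocess.run(argv, input=script, text=True, check=True)
```

    Because batch mode aborts on the first failing command, a scheduler or workflow engine calling `upload` can treat a raised exception as a failed transfer and retry or alert accordingly.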

    Read more →
  • GCube system

    GCube system

    gCube is an open source software system designed to enable the building and operation of a data infrastructure that provides its users with a rich array of services for supporting the co-creation of Virtual Research Environments and promoting the implementation of open science workflows and practices. It is at the heart of the D4Science Data Infrastructure. == Overview == It is primarily organised as a number of web services that offer functionality supporting the phases of knowledge production and sharing. In addition, it consists of a set of software libraries supporting service development, service-to-service integration, and the extension of service capabilities, and a set of portlets that implement user interface components for exploiting one or more services. It is designed to enable systems of systems. Its services rely on standards and mediators to interact with other services, and are themselves exposed through standards and APIs so that clients can use them. For instance, the DataMiner service implements the Web Processing Service protocol to let clients execute processes. The components dealing with Identity and Access Management rely on Keycloak and federate other IDMs, making the overall authentication and authorization management compliant with open standards such as OAuth2, User-Managed Access (UMA), and OpenID Connect (OIDC). The Catalogue relies on DCAT, OAI-PMH, and Catalogue Service for the Web to collect contents from other catalogues and data sources, and offers its content via DCAT, OAI-PMH, and a proprietary REST API (gCat REST API). Its Continuous Integration/Continuous Delivery pipeline, implemented with Jenkins, represents an innovative approach to software delivery, conceived to be scalable and easy to maintain and upgrade at minimal cost. 
== History == gCube has been developed in the context of the D4Science initiative with the support of several EU projects.

    Read more →