Title: ISAIC-MS-1171 DATA: Digital Archiving and Transformed Analytics -- Applied Machine Learning in feasible, analytical, scalable and testable approach
Abstract. As cloud-based web services become more capable, available, and powerful (CAP), data science and engineering is pulled toward the frontline because DATA means almost anything-as-a-service (XaaS) via Digital Archiving and Transformed Analytics. In general, a web service (via a website) serves customers with web documents in HTML, JSON, XML, and multimedia in interactive (request) and responsive (reply) ways for specific domain problem solving over the Internet. In particular, a web service is deeply involved with UI & UX (user interface and user experience) plus considerate regulations on QoS (Quality of Service), which refers to both information synthesis and security, namely availability and reliability for providential web services. This paper, based on the novel wiseCIO as a Platform-as-a-Service (PaaS), presents digital archiving and transformed analytics (DATA) via machine learning, one of the most practical aspects of artificial intelligence. Machine learning is the science of data analysis that automates analytical model building and online analytical processing (OLAP) and enables computers to act without being explicitly programmed through CTMP. Computational thinking combined with manageable processing is thoroughly discussed and utilized for FAST solutions in a feasible, analytical, scalable and testable approach. DATA is central to information synthesis and analytics (ISA), and digitized archives play a key role in transformed analytics on intelligence for business, education and entertainment (iBEE).
Case studies as applicable examples are discussed over broad fields where archival digitization is required for analytical transformation via machine learning, such as scalable ARM (archival repository for manageable accessibility), visual BUS (biological understanding from STEM), schooling DIGIA (digital intelligence governing instruction and administering), viewable HARP (historical archives & religious preachings), vivid MATH (mathematical apps in teaching and hands-on exercise), and SHARE (studies via hands-on assignment, revision and evaluation). As a result, wiseCIO promotes DATA service by providing ubiquitous web services of analytical processing via a universal interface and user-centric experience in favor of logical organization of web content and relational information groupings, which are vital steps in the ability of an archivist or librarian to recommend and retrieve information for a researcher. More importantly, wiseCIO also plays a key role as a content management system and delivery platform with the capacity to host 10,000+ traditional web pages with great ease.
Award ID(s):
2011938
PAR ID:
10321282
Author(s) / Creator(s):
; ; ; ;
Date Published:
Journal Name:
ISAIC 2020
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Arai, K. (Ed.)
    Integral digitalization aims to liaise with Universal interface for human-computer interaction, assemble Brewing aggregation via online analytical processing, and engage Centered user experience (UBC), which enables wiseCIO to orchestrate “Anything-as-a-Service” (XaaS). This paper presents three important concepts, iDATA, iDEA and ACTiVE, that together orchestrate XaaS on wiseCIO. iDATA stands for “integral digitalization via archival transformation and analytics” in support of content management, iDEA denotes “intelligence-driven efficient automation” for UBC processing with little coding required via machine learning automata, and ACTiVE represents “accessible, contextual and traceable information for vast engagement” with content delivery. iDATA is central to XaaS through computational thinking applied to multidimensional online analytical processing (mOLAP). Case studies are thoroughly discussed on a massive basis through iDATA over broad fields, such as manageable ARM (archival repository for manageable accessibility), animated BUS (biological understanding from STEM), sensible DASH (deliveries assembled for fast search & hits), smart DIGIA (digital intelligence governing instruction and administering), informative HARP (historical archives & religious preachings), vivid MATH (mathematical apps in teaching and hands-on exercise), and engaging SHARE (studies via hands-on assignment, review/revision and evaluation). As a result, iDATA-orchestrated wiseCIO is in favor of archival content management (ACM) and massive content delivery (MCD). Most recently, the comprehensive online teaching and learning (COTL) has been prepared and published as ACTiVE courseware with various multimedia and student online profiles for paperless homework, labs and submissions. The ACTiVE courseware is integrated with a capacity equivalent to 10,000+ traditional web pages and broadly used for advanced remote learning (ARL) in both synchronous and asynchronous models with great ease.
  2. Software developers are increasingly having conversations about software development via online chat services. Many of those chat communications contain valuable information, such as code descriptions, good programming practices, and causes of common errors/exceptions. However, the nature of chat community content is transient, as opposed to the archival nature of other developer communications such as email, bug reports and Q&A forums. As a result, important information and advice are lost over time. The focus of this dissertation is Extracting Archival Information from Software-Related Chats, specifically to (1) automatically identify conversations which contain archival-quality information, (2) accurately reduce the granularity of the information reported as archival information, and (3) conduct a case study to investigate how archival-quality information extracted from chats compares to related posts in Q&A forums. Archived knowledge from developer chats could potentially be used in several applications, such as creating a new archival mechanism available to a given chat community, augmenting Q&A forums, or facilitating the mining of specific information and improving software maintenance tools.
  3. A set of Information Assurance and Security hands-on learning modules is developed and open to the public. Topics include networking security, database security, defensive programming, web security, system fundamentals, mobile security, malware detection using machine learning, and big data analytics on network intrusion detection. The design follows a hands-on, case-based pedagogical model, which yields a satisfaction rate of up to 92.5% for self-learners.
  4. The exponential growth of digital content has generated massive textual datasets, necessitating the use of advanced analytical approaches. Large Language Models (LLMs) have emerged as tools that are capable of processing and extracting insights from massive unstructured textual datasets. However, how to leverage LLMs for text analytics in Information Systems (IS) research is currently unclear. To assist the IS community in understanding how to operationalize LLMs, we propose a Text Analytics for Information Systems Research (TAISR) framework. Our proposed framework provides detailed recommendations grounded in IS and LLM literature on how to conduct meaningful text analytics IS research for design science, behavioral, and econometric streams. We conducted three business intelligence case studies using our TAISR framework to demonstrate its application in several IS research contexts. We also outline the potential challenges and limitations of adopting LLMs for IS. By offering a systematic approach and evidence of its utility, our TAISR framework contributes to future IS research streams looking to incorporate powerful LLMs for text analytics.
  5. With the rapid development of the Internet of Things (IoT) and Big Data infrastructure, crowdsourcing techniques have emerged to facilitate data processing and problem solving, particularly for flood emergency purposes. A Flood Analytics Information System (FAIS) has been developed as a Python Web application to gather Big Data from multiple servers and analyze flooding impacts during historical and real-time events. The application is designed to integrate crowd intelligence, machine learning (ML), and natural language processing of tweets to provide flood warnings, with the aim of improving situational awareness for flood risk management. FAIS, a national-scale prototype, combines flood peak rates and river level information with geotagged tweets to identify a dynamic set of locations at risk of flooding. The prototype was successfully tested in real time during Hurricane Dorian flooding as well as for a historical event (Hurricane Florence) across the Carolinas, USA, where the storm caused extensive disruption to infrastructure and communities.