This is the web page for Text Analytics at the University of Oklahoma.
Class hours: Tuesday/Thursday 1:30 - 2:45pm
Carson Engr Ctr 0438
Live lectures and videos will be posted.
Final exam:
May 9, 2022
M 1:30 - 3:30 pm
Carson Engr Ctr 0438
Dr. Christan Grant
Jasmine DeHart
Note: Any email messages to the professors or teaching assistants must include `cs5293`, one word, lowercase, in the subject line. Any email without this string in the subject line will likely be filtered as junk.
Students must have a working understanding of statistics and data structures, in addition to a select set of software skills. The prerequisite courses/skills are listed here:

- Statistics: MATH 4743 or MATH 4753 or ISE 3293 or ISE 5013
- Data Structures/Discrete Structures: CS 5005 or CS 2413 and CS 2813
- Software skills: Students should be well versed in Java and/or C++ and also familiar with at least one scripting language such as Python. Students should also be comfortable working with the GNU/Linux command line.
Undergraduate students with a 3.5 GPA or higher may enroll with permission from the instructor.
Text Analytics are the methods and techniques used to extract useful knowledge from text to support decision making. This field includes a collection of research from the natural language processing, databases, data mining, and machine learning communities. The aim of this course is to be a primer for text analytics theory and practice. After taking this course, students will have an understanding of how to independently obtain, parse, and analyze textual information for organizations that want to extract valuable insights.
Topics discussed in the course include: obtaining data sets, understanding data formats, duplicate detection, cleaning data sets, tagging, indexing and search, evaluating algorithms, classification, clustering, topic modeling and entity resolution. Time permitting, we may discuss advanced topics such as relation extraction, slot filling, knowledge graphs, knowledge base construction, the semantic web, question answering or other cutting-edge topics.
Class time each week will be split among several formats: a mix of traditional lectures, class discussions, videos and other activities. Participation is required to get the most out of the class.
We will use the Canvas learning system. This course website can be reached through canvas.ou.edu. Please check this system regularly to keep informed of all announcements, updates, and changes. Important course information will also be distributed through the course website.
Suggested Textbooks:
Increasingly, software is developed and executed in “the cloud”. This semester the class will make heavy use of a popular cloud infrastructure. Students will be able to deploy virtual machines with various configurations, on the fly. Credentials for using this infrastructure will be distributed after the first week of class. For questions and issues using this software, students should use the in-class discussion board. All students enrolled in class should also have a CS account and access to Linux-based systems in the CS department. For most computer science students, an account will be automatically created. All code written for this course MUST run using the compilers or interpreters that will be specified for the assignments. It is your responsibility to ensure that your code runs on these systems. For compatibility reasons, we recommend developing and testing on a Linux-based machine.
Readings: Each week several resources are posted. You are responsible for their contents. You are expected to read the materials before class.
Laptop Computers: It is the responsibility of each student in this class to have a working laptop computer with ample battery (at least 2 hours of life under moderate usage) and wireless Internet connectivity. If meeting in person, you must bring the laptop computer to class. If your computer requires repair during the semester, it is your responsibility to make arrangements to have another computer available and to get the necessary software installed. There exist campus resources (including financial help) to repair broken computers; please see the instructors if you would like information about these programs. Note that temporarily borrowing a computer from a fellow student in the class can present a number of problems, including the potential for academic misconduct. You should back up all code and data in multiple locations.
Newsgroups and Email: The newsgroup on Canvas should be the primary method of communication (outside of class). This allows everyone in the class to benefit from the answer to your question, and provides students with more timely answers since the TAs and instructors check Canvas at least once a day. Matters of personal interest should be directed to email instead of to the newsgroup, e.g. informing the instructors of an extended personal illness.
Incompletes: The grade of “I” is intended for the rare circumstance when a student who has been successful in a class has an unexpected event occur shortly before the end of the class. We will not consider giving a student a grade of “I” unless the following three conditions have been met:
Religious Holidays: It is the policy of the University to excuse the absences of students that result from religious observances and to provide without penalty for the rescheduling of examinations and additional required classwork that may fall on religious holidays.
Classroom Conduct: Because cell phones and laptops can distract substantially from the classroom experience, students are asked not to use either during class, except in cases in which they are required as part of a classroom exercise. Disruptions of class will also not be permitted. In the case of disruptive behavior, we may ask that you leave the classroom and may charge you with a violation of the Student Code of Responsibilities and Conduct. Examples of disruptive behavior include:
All students are expected to wash their hands, social distance, and wear a mask according to school, state, and federal guidelines. It is my goal to be extremely considerate of personal and family situations, but please alert me to any problems you may be facing as early as possible. That way we can ensure you are able to Although the course will be online, please review the OU COVID policies at https://www.ou.edu/coronavirus. The university has also posted a coronavirus question-answer page.
Feel free to discuss all assignments with the instructors or the TAs.
Quizzes, Exams, In-Class Exercises: unless otherwise stated, you may not communicate with others about solutions to these assignments.
Make sure that your computer account is properly protected. Use an appropriate password, and do not give your friends access to your account or your computer system. Do not leave printouts, computers or thumb drives around a laboratory where others might access them.
Programming projects will be checked by software designed to detect collaboration. This software is extremely effective and has withstood repeated reviews by the campus judicial processes.
Points for this class will come from a variety of sources. The different components are weighted as follows:
Component | Percentage |
---|---|
Activities | 35% |
Discussions | 30% |
Projects | 35% |
Total | 100% |
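
As a rough illustration of how the weights above combine, the following sketch computes a single course score from per-component scores. The function name and the example scores are hypothetical, and it assumes each component is already expressed as a percentage from 0 to 100; the actual calculation is done in the Canvas grade book.

```python
# Component weights copied from the table above (percent of the course grade).
WEIGHTS = {"Activities": 35, "Discussions": 30, "Projects": 35}


def weighted_course_score(component_scores):
    """Combine per-component scores (each 0-100) into one course score (0-100)."""
    if set(component_scores) != set(WEIGHTS):
        raise ValueError("expected scores for exactly: " + ", ".join(WEIGHTS))
    total_weight = sum(WEIGHTS.values())  # 100 for the weights above
    weighted_sum = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return weighted_sum / total_weight


# Hypothetical student scores, for illustration only.
example = {"Activities": 90, "Discussions": 80, "Projects": 70}
print(weighted_course_score(example))
```

Because Activities and Projects carry equal weight, a strong score in one can offset a weak score in the other, while Discussions count slightly less.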
Activities will be assigned approximately every week. These may be coding assignments, essay questions, or other activities.
Discussion topics will be assigned regularly. The discussions will take place on the Canvas discussion boards, but students may also comment on the discussion topic in class to receive credit. Discussions may ask your opinion on a topic, or they could ask you to perform a task and “report back.”
Two to four Projects will be given over the course of the semester. These projects will require a substantial amount of planning, programming and debugging. We encourage you to budget your time well for these.
Unless otherwise specified, written student submissions should only be `.txt` files, portable document format `.pdf`, or Markdown `.md`.
Files of type `.doc`, `.docx`, or `.rtf` will not be accepted.
Compressed files should be of type `.gz` or `.tar.gz`.
Files of the `.rar` format will not be accepted.
Other file types, particularly coding files, may be used in the class.
The expected file type will be stated.
Often, files packaged under non-Unix/Linux flavored operating systems, such as
Windows, have a non-negative number of compatibility issues with our grading systems.
If the graders cannot open files for these reasons, the assignment will not receive credit.
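
Before submitting, a quick check of the file name against the rules above can catch an unsupported format early. This is a minimal sketch, not part of the grading system: the function name `classify_submission` is hypothetical, and treating extensions as case-insensitive is an assumption.

```python
# Extensions taken from the submission rules above; names here are hypothetical.
ACCEPTED_WRITTEN = (".txt", ".pdf", ".md")
ACCEPTED_COMPRESSED = (".gz", ".tar.gz")
REJECTED = (".doc", ".docx", ".rtf", ".rar")


def classify_submission(filename):
    """Return 'written', 'compressed', 'rejected', or 'other' for a file name."""
    name = filename.lower()  # assumption: extensions are compared case-insensitively
    if name.endswith(REJECTED):
        return "rejected"
    if name.endswith(ACCEPTED_COMPRESSED):
        return "compressed"
    if name.endswith(ACCEPTED_WRITTEN):
        return "written"
    return "other"  # e.g. code files; the expected type is stated per assignment


for candidate in ["report.md", "project1.tar.gz", "essay.docx", "main.py"]:
    print(candidate, "->", classify_submission(candidate))
```

A result of `other` does not mean the file is rejected; it only means the file type is not covered by the rules for written or compressed submissions.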
Late policies are often at odds with the ability of students to receive feedback. I strongly encourage all students to submit assignments by the posted due date. If assignments are not completed on time, the frequent assignments will mount up and the amount of work due may become insurmountable. The policy is that all assignments must be completed before they are graded. This typically means that students will have 3-7 days to complete the assignment. As long as a student submits the assignment before grading begins, we will not take off extra points. However, the grading time will not be announced, and we will not accept assignments after the deadline.
Grade cut-offs will be at or below the traditional 90, 80, 70, etc. cut-offs.
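
Since the actual cut-offs may be lower than the traditional ones, a traditional scale gives only the minimum letter grade a score can earn. The sketch below illustrates that reading; the function name is hypothetical, and the 60 cut-off for a D is an assumption inferred from the "etc." above.

```python
# Traditional cut-offs. Actual cut-offs will be at or below these, so the letter
# returned is a floor, not a prediction. The 60 cut-off for a D is an assumption.
TRADITIONAL_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def minimum_letter_grade(score, cutoffs=TRADITIONAL_CUTOFFS):
    """Return the lowest letter grade a course score (0-100) can map to."""
    for threshold, letter in cutoffs:  # thresholds are listed from highest to lowest
        if score >= threshold:
            return letter
    return "F"


# Hypothetical course scores, for illustration only.
for score in (93.5, 80, 79.9, 42):
    print(score, "->", minimum_letter_grade(score))
```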
Please note that when an exam/assignment is brought with grading questions, we may examine the entire exam/assignment and your final grade may end up lower.
Canvas has a grade book that is used to store the data that are used to calculate your course grade. It is the responsibility of each student in this class to check their grades on Canvas after each assignment is returned. If an error is found, bring the graded document to any of the instructors or TAs, and we will correct Canvas.
Situation | Integrity Violation? |
---|---|
Students A and B meet and work on their assignment together. Neither student prepared anything in advance and the resulting work is identical. | Yes |
Students A and B create drafts of their assignment independently and get together to compare answers and discuss their understanding of the material. Each person decides independently whether to make changes that are discussed. | No |
Students A and B agree to prepare drafts of their assignments independently, but only Student A does. Student A shares her draft with Student B, who reviews it and offers suggestions for improvement. | Yes |
Students A and B agree that Student A will work the even problems and Student B will work the odd problems. They share their work. | Yes |
Students A and B agree that Student A will work on a read function and Student B will work on the sorting function. They share their solutions. | Yes |
Student A has completed a project and is helping Student B complete the same project. Student A explains to Student B what Student B’s code actually does, which is different from what Student B thinks the code does. Student B determines how to modify the code independently. | No |
Student A has completed a project and is helping Student B complete the same project. Student B is having trouble getting one part of the program to work, so Student A texts Student B three lines of their solution. | Yes |
Student A has completed a project and is helping Student B complete the same project. Student B is having difficulty getting the program to work, so Student A tells Student B exactly what to type for several lines. | Yes |
Student A has completed a project and is helping Student B complete the same project. Student B is having difficulty getting the program to work, so Student A suggests that Student B use a specific debugging strategy (e.g. “Print out the contents of the variable”). | No |
Student A has completed a project and is helping Student B complete the same project. Student A shows Student B an example program in the online textbook that will be helpful in figuring out the solution to the problem. | No |
Student A publishes solutions to an assignment on a public Internet page. | Yes |
Students A and B work on a project together. After they have finished it, Student A takes the code and modifies it so the programs do not appear to be identical. | Yes |
Student A copies and pastes code from a public Internet page but changes the variable names. | Yes |
By the end of the semester, the students will increase their:
The College of Engineering utilizes student ratings as one of the bases for evaluating the teaching effectiveness of each of its faculty members. The results of these forms are important data used in the process of awarding tenure, making promotions, and giving salary increases. In addition, the faculty uses these forms to improve their own teaching effectiveness. The original request for the use of these forms came from students, and it is students who eventually benefit most from their use. Please take this task seriously and respond as honestly and precisely as possible, both to the machine-scored items and to the open-ended questions.
The University of Oklahoma is committed to providing reasonable accommodation for all students with disabilities. Students with disabilities who require accommodations in this course are requested to speak with the professor as early in the semester as possible. Students with disabilities must be registered with the Office of Disability Services prior to receiving accommodations in this course. The Office of Disability Services is located in the University Community Center at 730 College Avenue; the phone is 405-325-3852 or, for TDD only, 405-325-4173.
Should you need modifications or adjustments to your course requirements because of documented pregnancy-related or childbirth-related issues, please contact me as soon as possible to discuss. Generally, modifications will be made where medically necessary and similar in scope to accommodations based on temporary disability. Please see http://www.ou.edu/eoo/faqs/pregnancy-faqs.html for commonly asked questions.
For any concerns regarding gender-based discrimination, sexual harassment, sexual misconduct, stalking, or intimate partner violence, the University offers a variety of resources, including advocates on-call 24/7, counseling services, mutual no contact orders, scheduling adjustments and disciplinary sanctions against the perpetrator. Please contact the Sexual Misconduct Office 405-325-2215 (8-5, M-F) or OU Advocates 405-615-0013 (24/7) to learn more or to report an incident.
During an emergency, there are official university procedures that will maximize your safety.
If you receive an OU Alert to seek refuge or hear a tornado siren that signals severe weather:
Link to Severe Weather Refuge Areas
Severe Weather Preparedness - Video
If you receive an OU Alert to shelter-in-place due to an active shooter or armed intruder situation or you hear what you perceive to be gunshots:
For more information, visit http://www.ou.edu/emergencypreparedness.html
Shots Fired on Campus Procedure - Video
If you receive an OU Alert that there is danger inside or near the building, or the fire alarm inside the building activates:
For OU IT support, please phone (405) 325-HELP. For help with issues pertaining to any CS department machines (in room DEH 115).
This syllabus is subject to change. Students are responsible for any changes/additions to this syllabus announced during the semester.
Long before the University of Oklahoma was established, the land on which the University now resides was the traditional home of the “Hasinais” Caddo Nation and “Kirikirʔi:s” Wichita & Affiliated Tribes.
We acknowledge this territory once also served as a hunting ground, trade exchange point, and migration route for the Apache, Comanche, Kiowa and Osage nations.
Today, 39 tribal nations dwell in the state of Oklahoma as a result of settler and colonial policies that were designed to assimilate Native people.
The University of Oklahoma recognizes the historical connection our university has with its indigenous community. We acknowledge, honor and respect the diverse Indigenous peoples connected to this land. We fully recognize, support and advocate for the sovereign rights of all of Oklahoma’s 39 tribal nations. This acknowledgement is aligned with our university’s core value of creating a diverse and inclusive community. It is an institutional responsibility to recognize and acknowledge the people, culture and history that make up our entire OU Community.
This page is available online at: https://oudalab.github.io/cs5293sp22/syllabus